• Arianna Avanzini's avatar
    block, bfq: add full hierarchical scheduling and cgroups support · e21b7a0b
    Arianna Avanzini authored
    
    
    Add complete support for full hierarchical scheduling, with a cgroups
    interface. Full hierarchical scheduling is implemented through the
    'entity' abstraction: both bfq_queues, i.e., the internal BFQ queues
    associated with processes, and groups are represented in general by
    entities. Given the bfq_queues associated with the processes belonging
    to a given group, the entities representing these queues are sons of
    the entity representing the group. At higher levels, if a group, say
    G, contains other groups, then the entity representing G is the parent
    entity of the entities representing the groups in G.
    
    Hierarchical scheduling is performed as follows: if the timestamps of
    a leaf entity (i.e., of a bfq_queue) change, and such a change lets
    the entity become the next-to-serve entity for its parent entity, then
    the timestamps of the parent entity are recomputed as a function of
    the budget of its new next-to-serve leaf entity. If the parent entity
    belongs, in its turn, to a group, and its new timestamps let it become
    the next-to-serve for its parent entity, then the timestamps of the
    latter parent entity are recomputed as well, and so on. When a new
    bfq_queue must be set in service, the reverse path is followed: the
    next-to-serve highest-level entity is chosen, then its next-to-serve
    child entity, and so on, until the next-to-serve leaf entity is
    reached, and the bfq_queue that this entity represents is set in
    service.
    
    Writeback is accounted for on a per-group basis, i.e., for each group,
    the async I/O requests of the processes of the group are enqueued in a
    distinct bfq_queue, and the entity associated with this queue is a
    child of the entity associated with the group.
    
    Weights can be assigned explicitly to groups and processes through the
    cgroups interface, differently from what happens, for single
    processes, if the cgroups interface is not used (as explained in the
    description of the previous patch). In particular, since each node has
    a full scheduler, each group can be assigned its own weight.
    
    Signed-off-by: default avatarFabio Checconi <fchecconi@gmail.com>
    Signed-off-by: default avatarPaolo Valente <paolo.valente@linaro.org>
    Signed-off-by: default avatarArianna Avanzini <avanzini.arianna@gmail.com>
    Signed-off-by: default avatarJens Axboe <axboe@fb.com>
    e21b7a0b