1. 15 Jun, 2018 1 commit
  2. 05 Nov, 2017 2 commits
  3. 02 Nov, 2017 1 commit
    • License cleanup: add SPDX GPL-2.0 license identifier to files with no license · b2441318
      Greg Kroah-Hartman authored
      Many source files in the tree are missing licensing information, which
      makes it harder for compliance tools to determine the correct license.
      
      By default all files without license information are under the default
      license of the kernel, which is GPL version 2.
      
      Update the files which contain no license information with the 'GPL-2.0'
      SPDX license identifier.  The SPDX identifier is a legally binding
      shorthand, which can be used instead of the full boilerplate text.
      
      This patch is based on work done by Thomas Gleixner and Kate Stewart and
      Philippe Ombredanne.
      
      How this work was done:
      
      Patches were generated and checked against linux-4.14-rc6 for a subset of
      the use cases:
       - file had no licensing information in it,
       - file was a */uapi/* one with no licensing information in it,
       - file was a */uapi/* one with existing licensing information.
      
      Further patches will be generated in subsequent months to fix up cases
      where non-standard license headers were used, and references to license
      had to be inferred by heuristics based on keywords.
      
      The analysis to determine which SPDX License Identifier to be applied to
      a file was done in a spreadsheet of side by side results from the
      output of two independent scanners (ScanCode & Windriver) producing SPDX
      tag:value files created by Philippe Ombredanne.  Philippe prepared the
      base worksheet, and did an initial spot review of a few 1000 files.
      
      The 4.13 kernel was the starting point of the analysis with 60,537 files
      assessed.  Kate Stewart did a file by file comparison of the scanner
      results in the spreadsheet to determine which SPDX license identifier(s)
      to be applied to the file. She confirmed any determination that was not
      immediately clear with lawyers working with the Linux Foundation.
      
      Criteria used to select files for SPDX license identifier tagging were:
       - Files considered eligible had to be source code files.
       - Make and config files were included as candidates if they contained >5
         lines of source
       - File already had some variant of a license header in it (even if <5
         lines).
      
      All documentation files were explicitly excluded.
      
      The following heuristics were used to determine which SPDX license
      identifiers to apply.
      
       - when both scanners couldn't find any license traces, file was
         considered to have no license information in it, and the top level
         COPYING file license applied.
      
         For non */uapi/* files that summary was:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0                                              11139
      
         and resulted in the first patch in this series.
      
         If that file was a */uapi/* path one, it was "GPL-2.0 WITH
         Linux-syscall-note" otherwise it was "GPL-2.0".  Results of that were:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|-------
         GPL-2.0 WITH Linux-syscall-note                        930
      
         and resulted in the second patch in this series.
      
       - if a file had some form of licensing information in it, and was one
         of the */uapi/* ones, it was denoted with the Linux-syscall-note if
         any GPL family license was found in the file or had no licensing in
         it (per prior point).  Results summary:
      
         SPDX license identifier                            # files
         ---------------------------------------------------|------
         GPL-2.0 WITH Linux-syscall-note                       270
         GPL-2.0+ WITH Linux-syscall-note                      169
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-2-Clause)    21
         ((GPL-2.0 WITH Linux-syscall-note) OR BSD-3-Clause)    17
         LGPL-2.1+ WITH Linux-syscall-note                      15
         GPL-1.0+ WITH Linux-syscall-note                       14
         ((GPL-2.0+ WITH Linux-syscall-note) OR BSD-3-Clause)    5
         LGPL-2.0+ WITH Linux-syscall-note                       4
         LGPL-2.1 WITH Linux-syscall-note                        3
         ((GPL-2.0 WITH Linux-syscall-note) OR MIT)              3
         ((GPL-2.0 WITH Linux-syscall-note) AND MIT)             1
      
         and that resulted in the third patch in this series.
      
       - when the two scanners agreed on the detected license(s), that became
         the concluded license(s).
      
       - when there was disagreement between the two scanners (one detected a
         license but the other didn't, or they both detected different
         licenses) a manual inspection of the file occurred.
      
       - In most cases a manual inspection of the information in the file
         resulted in a clear resolution of the license that should apply (and
         which scanner probably needed to revisit its heuristics).
      
       - When it was not immediately clear, the license identifier was
         confirmed with lawyers working with the Linux Foundation.
      
       - If there was any question as to the appropriate license identifier,
         the file was flagged for further research and to be revisited later
         in time.
      
      In total, over 70 hours of logged manual review was done on the
      spreadsheet to determine the SPDX license identifiers to apply to the
      source files by Kate, Philippe, Thomas and, in some cases, confirmation
      by lawyers working with the Linux Foundation.
      
      Kate also obtained a third independent scan of the 4.13 code base from
      FOSSology, and compared selected files where the other two scanners
      disagreed against that SPDX file, to see if there were new insights.  The
      Windriver scanner is based on an older version of FOSSology in part, so
      they are related.
      
      Thomas did random spot checks in about 500 files from the spreadsheets
      for the uapi headers and agreed with SPDX license identifier in the
      files he inspected. For the non-uapi files Thomas did random spot checks
      in about 15000 files.
      
      In the initial set of patches against 4.14-rc6, 3 files were found to have
      copy/paste license identifier errors, and have been fixed to reflect the
      correct identifier.
      
      Additionally Philippe spent 10 hours this week doing a detailed manual
      inspection and review of the 12,461 patched files from the initial patch
      version early this week with:
       - a full scancode scan run, collecting the matched texts, detected
         license ids and scores
       - reviewing anything where there was a license detected (about 500+
         files) to ensure that the applied SPDX license was correct
       - reviewing anything where there was no detection but the patch license
         was not GPL-2.0 WITH Linux-syscall-note to ensure that the applied
         SPDX license was correct
      
      This produced a worksheet with 20 files needing minor correction.  This
      worksheet was then exported into 3 different .csv files for the
      different types of files to be modified.
      
      These .csv files were then reviewed by Greg.  Thomas wrote a script to
      parse the csv files and add the proper SPDX tag to the file, in the
      format that the file expected.  This script was further refined by Greg
      based on the output to detect more types of files automatically and to
      distinguish between header and source .c files (which need different
      comment types.)  Finally Greg ran the script using the .csv files to
      generate the patches.
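
The header-versus-source distinction mentioned above can be sketched as follows. This is a minimal illustration, not the script Thomas and Greg used: `has_suffix()` and `spdx_line()` are hypothetical names, and the comment styles follow the convention described in the kernel's license rules documentation (`//` for .c files, `/* */` for headers).

```c
#include <stdio.h>
#include <string.h>

/* Return nonzero if path ends with the given suffix. */
static int has_suffix(const char *path, const char *suffix)
{
	size_t plen = strlen(path), slen = strlen(suffix);

	return plen >= slen && strcmp(path + plen - slen, suffix) == 0;
}

/*
 * Hypothetical helper: write the SPDX tag line for 'path' into 'out'.
 * Returns 0 on success, -1 if the file type is not handled here or
 * the output buffer is too small.
 */
static int spdx_line(const char *path, const char *license,
		     char *out, size_t outlen)
{
	int n;

	if (has_suffix(path, ".c"))
		/* C source files take a C++ style comment. */
		n = snprintf(out, outlen,
			     "// SPDX-License-Identifier: %s", license);
	else if (has_suffix(path, ".h"))
		/* Headers take a classic C comment. */
		n = snprintf(out, outlen,
			     "/* SPDX-License-Identifier: %s */", license);
	else
		return -1;

	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

A real tool would also need to handle assembly, scripts and Makefiles, which this sketch deliberately leaves out.
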
      Reviewed-by: Kate Stewart <kstewart@linuxfoundation.org>
      Reviewed-by: Philippe Ombredanne <pombredanne@nexb.com>
      Reviewed-by: Thomas Gleixner <tglx@linutronix.de>
      Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
  4. 04 Sep, 2015 1 commit
  5. 22 Jul, 2015 1 commit
  6. 15 Jul, 2014 1 commit
    • cgroup: rename cgroup_subsys->base_cftypes to ->legacy_cftypes · 5577964e
      Tejun Heo authored
      Currently, cgroup_subsys->base_cftypes is used for both the unified
      default hierarchy and legacy ones and subsystems can mark each file
      with either CFTYPE_ONLY_ON_DFL or CFTYPE_INSANE if it has to appear
      only on one of them.  This is quite hairy and error-prone.  Also, we
      may end up exposing interface files to the default hierarchy without
      thinking it through.
      
      cgroup_subsys will grow two separate cftype arrays and apply each only
      on the hierarchies of the matching type.  This will allow organizing
      cftypes in a lot clearer way and encourage subsystems to scrutinize
      the interface which is being exposed in the new default hierarchy.
      
      In preparation, this patch renames cgroup_subsys->base_cftypes to
      cgroup_subsys->legacy_cftypes.  This patch is pure rename.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@kernel.org>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
  7. 16 May, 2014 3 commits
  8. 13 May, 2014 1 commit
    • cgroup: replace cftype->write_string() with cftype->write() · 451af504
      Tejun Heo authored
      Convert all cftype->write_string() users to the new cftype->write()
      which maps directly to kernfs write operation and has full access to
      kernfs and cgroup contexts.  The conversions are mostly mechanical.
      
      * @css and @cft are accessed using of_css() and of_cft() accessors
        respectively instead of being specified as arguments.
      
      * Should return @nbytes on success instead of 0.
      
      * @buf is not trimmed automatically.  Trim if necessary.  Note that
        blkcg and netprio don't need this as the parsers already handle
        whitespaces.
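
The return-value and trimming rules in the list above can be illustrated with a small userspace sketch. It is not kernel code: `strip()` stands in for the kernel's strstrip(), and `demo_write()` is a hypothetical handler that parses one integer, returning the number of bytes consumed on success and a negative errno value on failure.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>

/* Trim leading and trailing whitespace in place; returns the new start. */
static char *strip(char *s)
{
	char *end;

	while (isspace((unsigned char)*s))
		s++;
	end = s + strlen(s);
	while (end > s && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';
	return s;
}

/*
 * Hypothetical write handler: the buffer is not trimmed by the caller,
 * so trim it here, and report success as nbytes rather than 0.
 */
static ssize_t demo_write(char *buf, size_t nbytes, long *value)
{
	char *end;
	long v;

	buf = strip(buf);
	if (*buf == '\0')
		return -EINVAL;

	errno = 0;
	v = strtol(buf, &end, 10);
	if (errno || *end != '\0')
		return -EINVAL;

	*value = v;
	return (ssize_t)nbytes;	/* whole write consumed */
}
```

Returning the full byte count tells the caller that the whole write was consumed, which is what a write-style interface expects.
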
      
      cftype->write_string() has no users left after the conversions and is
      removed.
      
      While at it, remove unnecessary local variable @p in
      cgroup_subtree_control_write() and stale comment about
      CGROUP_LOCAL_BUFFER_SIZE in cgroup_freezer.c.
      
      This patch doesn't introduce any visible behavior changes.
      
      v2: netprio was missing from conversion.  Converted.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Aristeu Rozanski <arozansk@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Neil Horman <nhorman@tuxdriver.com>
      Cc: "David S. Miller" <davem@davemloft.net>
  9. 05 May, 2014 1 commit
    • device_cgroup: check if exception removal is allowed · d2c2b11c
      Aristeu Rozanski authored
      When the device cgroup hierarchy was introduced in
      	bd2953eb - devcg: propagate local changes down the hierarchy
      
      a specific case was overlooked. Consider the hierarchy below:
      
      	A	default policy: ALLOW, exceptions will deny access
      	 \
      	  B	default policy: ALLOW, exceptions will deny access
      
      There's no need to verify when a new exception is added to B because
      in this case exceptions will deny access to further devices, which is
      always fine. Hierarchy in device cgroup only makes sure B won't have
      more access than A.
      
      But when an exception is removed (by writing devices.allow), it isn't
      checked if the user is in fact removing an inherited exception from A,
      thus giving more access to B.
      
      Example:
      
      	# echo 'a' >A/devices.allow
      	# echo 'c 1:3 rw' >A/devices.deny
      	# echo $$ >A/B/tasks
      	# echo >/dev/null
      	-bash: /dev/null: Operation not permitted
      	# echo 'c 1:3 w' >A/B/devices.allow
      	# echo >/dev/null
      	#
      
      This shouldn't be allowed and this patch fixes it by making sure to never allow
      exceptions in this case to be removed if the exception is partially or fully
      present on the parent.
      
      v3: missing '*' in function description
      v2: improved log message and formatting fixes
      
      Cc: cgroups@vger.kernel.org
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  10. 04 May, 2014 1 commit
  11. 21 Apr, 2014 1 commit
    • device_cgroup: rework device access check and exception checking · 79d71974
      Aristeu Rozanski authored
      Whenever a device file is opened and checked against current device
      cgroup rules, it uses the same function (may_access()) as when a new
      exception rule is added by writing devices.{allow,deny}. In both
      cases the algorithm is the same, regardless of the behavior.
      
      First problem is having device access to be considered the same as rule
      checking. Consider the following structure:
      
      	A	(default behavior: allow, exceptions disallow access)
      	 \
      	  B	(default behavior: allow, exceptions disallow access)
      
      A new exception is added to B by writing devices.deny:
      
      	c 12:34 rw
      
      When checking if that exception is allowed in may_access():
      
      	if (dev_cgroup->behavior == DEVCG_DEFAULT_ALLOW) {
      		if (behavior == DEVCG_DEFAULT_ALLOW) {
      			/* the exception will deny access to certain devices */
      			return true;
      
      This is fine: since B is not getting more privileges than A, the rule
      is accepted.
      
      Now, consider it's a device file open check and the process belongs to
      cgroup B. The access will be generated as:
      
      	behavior: allow
      	exception: c 12:34 rw
      
      The very same chunk of code will allow it, even if there's an explicit
      exception telling to do otherwise.
      
      A simple test case:
      
      	# mkdir new_group
      	# cd new_group
      	# echo $$ >tasks
      	# echo "c 1:3 w" >devices.deny
      	# echo >/dev/null
      	# echo $?
      	0
      
      This is a serious bug and was introduced in
      
      	c39a2a30 devcg: prepare may_access() for hierarchy support
      
      To solve this problem, the device file open function was split from the
      new exception check.
      
      Second problem is how exceptions are processed by may_access(). The
      first part of the said function tries to match fully with an existing
      exception:
      
      	list_for_each_entry_rcu(ex, &dev_cgroup->exceptions, list) {
      		if ((refex->type & DEV_BLOCK) && !(ex->type & DEV_BLOCK))
      			continue;
      		if ((refex->type & DEV_CHAR) && !(ex->type & DEV_CHAR))
      			continue;
      		if (ex->major != ~0 && ex->major != refex->major)
      			continue;
      		if (ex->minor != ~0 && ex->minor != refex->minor)
      			continue;
      		if (refex->access & (~ex->access))
      			continue;
      		match = true;
      		break;
      	}
      
      That means the new exception should be contained into an existing one to
      be considered a match:
      
      	New exception		Existing	match?	notes
      	b 12:34 rwm		b 12:34 rwm	yes
      	b 12:34 r		b *:34 rw	yes
      	b 12:34 rw		b 12:34 w	no	extra "r"
      	b *:34 rw		b 12:34 rw	no	too broad "*"
      	b *:34 rw		b *:34 rwm	yes
      
      Which is fine in some cases. Consider:
      
      	A	(default behavior: deny, exceptions allow access)
      	 \
      	  B	(default behavior: deny, exceptions allow access)
      
      In this case the full match makes sense: the new exception cannot add
      more access than the parent allows.
      
      But this doesn't always work, consider:
      
      	A	(default behavior: allow, exceptions disallow access)
      	 \
      	  B	(default behavior: deny, exceptions allow access)
      
      In this case, a new exception in B shouldn't match any of the exceptions
      in A, after all you can't allow something that was forbidden by A. But
      consider this scenario:
      
      	New exception	Existing in A	match?	outcome
      	b 12:34 rw	b 12:34 r	no	exception is accepted
      
      Because the new exception has "w" as extra, it doesn't match, so it'll
      be added to B's exception list.
      
      The same problem can happen during a file access check. Consider a
      cgroup with allow as default behavior:
      
      	Access		Exception	match?
      	b 12:34 rw	b 12:34 r	no
      
      In this case, the access didn't match any of the exceptions in the
      cgroup, which is required since exceptions will disallow access.
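
The difference between a full match and a partial (overlapping) match on a single exception can be sketched in userspace C. This is an illustration only, assuming a simplified `struct dev_rule` and hypothetical constant and function names; it is not the kernel's implementation.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for this sketch only. */
#define DEV_BLOCK	1
#define DEV_CHAR	2
#define ACC_MKNOD	1
#define ACC_READ	2
#define ACC_WRITE	4
#define DEV_ANY		UINT32_MAX	/* stands for the "*" wildcard */

struct dev_rule {
	int type;
	uint32_t major, minor;
	int access;
};

/* Full match: 'ref' must be entirely contained in exception 'ex'. */
static bool match_full(const struct dev_rule *ex, const struct dev_rule *ref)
{
	if (!(ex->type & ref->type))
		return false;
	if (ex->major != DEV_ANY && ex->major != ref->major)
		return false;
	if (ex->minor != DEV_ANY && ex->minor != ref->minor)
		return false;
	/* ref may not ask for any access bit that ex lacks */
	return (ref->access & ~ex->access) == 0;
}

/* Partial match: 'ref' and 'ex' overlap in at least one access bit. */
static bool match_partial(const struct dev_rule *ex, const struct dev_rule *ref)
{
	if (!(ex->type & ref->type))
		return false;
	/* a wildcard on either side overlaps with anything */
	if (ex->major != DEV_ANY && ref->major != DEV_ANY &&
	    ex->major != ref->major)
		return false;
	if (ex->minor != DEV_ANY && ref->minor != DEV_ANY &&
	    ex->minor != ref->minor)
		return false;
	return (ref->access & ex->access) != 0;
}
```

With the exception "b 12:34 r" and the access "b 12:34 rw", `match_full()` reports no match because of the extra "w", while `match_partial()` reports a match because the "r" bit overlaps.
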
      
      To solve this problem, two new functions were created to match an
      exception either fully or partially. In the example above, a partial
      check will be performed and it'll produce a match since at least
      "b 12:34 r" from "b 12:34 rw" access matches.
      
      Cc: cgroups@vger.kernel.org
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: stable@vger.kernel.org
      Signed-off-by: Aristeu Rozanski <arozansk@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
  12. 19 Mar, 2014 1 commit
    • cgroup: drop const from @buffer of cftype->write_string() · 4d3bb511
      Tejun Heo authored
      cftype->write_string() just passes on the writeable buffer from kernfs
      and there's no reason to add const restriction on the buffer.  The
      only thing const achieves is unnecessarily complicating parsing of the
      buffer.  Drop const from @buffer.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
      Cc: Paul Mackerras <paulus@samba.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Arnaldo Carvalho de Melo <acme@ghostprotocols.net>
      Cc: Daniel Borkmann <dborkman@redhat.com>
      Cc: Michal Hocko <mhocko@suse.cz>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
  13. 08 Feb, 2014 1 commit
    • cgroup: clean up cgroup_subsys names and initialization · 073219e9
      Tejun Heo authored
      cgroup_subsys is a bit messier than it needs to be.
      
      * The name of a subsys can be different from its internal identifier
        defined in cgroup_subsys.h.  Most subsystems use the matching name
        but three - cpu, memory and perf_event - use different ones.
      
      * cgroup_subsys_id enums are postfixed with _subsys_id and each
        cgroup_subsys is postfixed with _subsys.  cgroup.h is widely
        included throughout various subsystems, it doesn't and shouldn't
        have claim on such generic names which don't have any qualifier
        indicating that they belong to cgroup.
      
      * cgroup_subsys->subsys_id should always equal the matching
        cgroup_subsys_id enum; however, we require each controller to
        initialize it and then BUG if they don't match, which is a bit
        silly.
      
      This patch cleans up cgroup_subsys names and initialization by doing
      the following.
      
      * cgroup_subsys_id enums are now postfixed with _cgrp_id, and each
        cgroup_subsys with _cgrp_subsys.
      
      * With the above, renaming subsys identifiers to match the userland
        visible names doesn't cause any naming conflicts.  All non-matching
        identifiers are renamed to match the official names.
      
        cpu_cgroup -> cpu
        mem_cgroup -> memory
        perf -> perf_event
      
      * controllers no longer need to initialize ->subsys_id and ->name.
        They're generated in cgroup core and set automatically during boot.
      
      * Redundant cgroup_subsys declarations removed.
      
      * While updating BUG_ON()s in cgroup_init_early(), convert them to
        WARN()s.  BUGging that early during boot is stupid - the kernel
        can't print anything, even through serial console and the trap
        handler doesn't even link stack frame properly for back-tracing.
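
One common way to generate ids and names from a single list, so that individual controllers never set them by hand, is an X-macro. The sketch below only illustrates that idea under assumed names (`DEMO_SUBSYS_LIST`, `demo_subsys_name`, `demo_subsys_lookup()` are hypothetical); it is not a copy of the cgroup core code.

```c
#include <string.h>

/* Single source of truth: the list of subsystem names. */
#define DEMO_SUBSYS_LIST(X)	\
	X(cpu)			\
	X(memory)		\
	X(perf_event)

/* Expand the list into <name>_cgrp_id enumerators. */
#define DEMO_ID(name) name##_cgrp_id,
enum demo_subsys_id {
	DEMO_SUBSYS_LIST(DEMO_ID)
	DEMO_SUBSYS_COUNT
};

/* Expand the same list into a name table indexed by id. */
#define DEMO_NAME(name) [name##_cgrp_id] = #name,
static const char *const demo_subsys_name[DEMO_SUBSYS_COUNT] = {
	DEMO_SUBSYS_LIST(DEMO_NAME)
};

/* Look up an id by its userland-visible name; returns -1 if unknown. */
static int demo_subsys_lookup(const char *name)
{
	int i;

	for (i = 0; i < DEMO_SUBSYS_COUNT; i++)
		if (strcmp(demo_subsys_name[i], name) == 0)
			return i;
	return -1;
}
```

Because both the enum and the table come from the same list, an id and its name cannot drift apart, which removes the need for a runtime consistency check.
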
      
      This patch doesn't introduce any behavior changes.
      
      v2: Rebased on top of fe1217c4 ("net: net_cls: move cgroupfs
          classid handling into core").
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Neil Horman <nhorman@tuxdriver.com>
      Acked-by: "David S. Miller" <davem@davemloft.net>
      Acked-by: "Rafael J. Wysocki" <rjw@rjwysocki.net>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Peter Zijlstra <peterz@infradead.org>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Ingo Molnar <mingo@redhat.com>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Serge E. Hallyn <serue@us.ibm.com>
      Cc: Vivek Goyal <vgoyal@redhat.com>
      Cc: Thomas Graf <tgraf@suug.ch>
  14. 05 Dec, 2013 1 commit
    • cgroup: replace cftype->read_seq_string() with cftype->seq_show() · 2da8ca82
      Tejun Heo authored
      In preparation of conversion to kernfs, cgroup file handling is
      updated so that it can be easily mapped to kernfs.  This patch
      replaces cftype->read_seq_string() with cftype->seq_show() which is
      not limited to single_open() operation and will map directly to
      kernfs seq_file interface.
      
      The conversions are mechanical.  As ->seq_show() doesn't have @css and
      @cft, the functions which make use of them are converted to use
      seq_css() and seq_cft() respectively.  On several occasions, e.g. if
      it has seq_string in its name, the function name is updated to fit the
      new method better.
      
      This patch does not introduce any behavior changes.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Aristeu Rozanski <arozansk@redhat.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
      Cc: Neil Horman <nhorman@tuxdriver.com>
  15. 24 Oct, 2013 1 commit
  16. 09 Aug, 2013 7 commits
    • cgroup: make css_for_each_descendant() and friends include the origin css in the iteration · bd8815a6
      Tejun Heo authored
      Previously, all css descendant iterators didn't include the origin
      (root of subtree) css in the iteration.  The reasons were maintaining
      consistency with css_for_each_child() and that at the time of
      introduction more use cases needed skipping the origin anyway;
      however, given that css_is_descendant() considers self to be a
      descendant, omitting the origin css has become more confusing and
      looking at the accumulated use cases rather clearly indicates that
      including origin would result in simpler code overall.
      
      While this is a change which can easily lead to subtle bugs, cgroup
      API including the iterators has recently gone through major
      restructuring and no out-of-tree changes will be applicable without
      adjustments making this a relatively acceptable opportunity for this
      type of change.
      
      The conversions are mostly straight-forward.  If the iteration block
      had explicit origin handling before or after, it's moved inside the
      iteration.  If not, if (pos == origin) continue; is added.  Some
      conversions add extra reference get/put around origin handling by
      consolidating origin handling and the rest.  While the extra ref
      operations aren't strictly necessary, this shouldn't cause any
      noticeable difference.
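
The effect of including the origin in the walk, and the `if (pos == origin) continue;` idiom for callers that still want to skip it, can be sketched with a plain userspace tree. `struct node`, `next_descendant_pre()` and the `for_each_descendant_pre()` macro are hypothetical stand-ins for this illustration, not the css iterators themselves.

```c
#include <stddef.h>

struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/* Pre-order successor of 'pos' within the subtree rooted at 'origin'. */
static struct node *next_descendant_pre(struct node *pos, struct node *origin)
{
	if (!pos)
		return origin;		/* first step yields the origin itself */
	if (pos->first_child)
		return pos->first_child;
	/* climb until a sibling is found, never leaving the subtree */
	while (pos != origin) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}
	return NULL;
}

#define for_each_descendant_pre(pos, origin)			\
	for ((pos) = next_descendant_pre(NULL, (origin)); (pos);	\
	     (pos) = next_descendant_pre((pos), (origin)))

/* Count visited nodes, optionally skipping the origin as old users did. */
static int count_descendants(struct node *origin, int skip_origin)
{
	struct node *pos;
	int n = 0;

	for_each_descendant_pre(pos, origin) {
		if (skip_origin && pos == origin)
			continue;
		n++;
	}
	return n;
}
```

Callers that previously handled the origin separately can fold that handling into the loop body, while callers that must not touch it add the single `continue` line.
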
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
    • cgroup: make hierarchy iterators deal with cgroup_subsys_state instead of cgroup · 492eb21b
      Tejun Heo authored
      cgroup is currently in the process of transitioning to using css
      (cgroup_subsys_state) as the primary handle instead of cgroup in
      subsystem API.  For hierarchy iterators, this is beneficial because
      
      * In most cases, css is the only thing subsystems care about anyway.
      
      * On the planned unified hierarchy, iterations for different
        subsystems will need to skip over different subtrees of the
        hierarchy depending on which subsystems are enabled on each cgroup.
        Passing around css makes it unnecessary to explicitly specify the
        subsystem in question, as a css is the intersection between a cgroup
        and a subsystem.
      
      * For the planned unified hierarchy, css's would need to be created
        and destroyed dynamically independent from cgroup hierarchy.  Having
        cgroup core manage css iteration makes enforcing deref rules a lot
        easier.
      
      Most subsystem conversions are straight-forward.  Noteworthy changes
      are
      
      * blkio: cgroup_to_blkcg() is no longer used.  Removed.
      
      * freezer: cgroup_freezer() is no longer used.  Removed.
      
      * devices: cgroup_to_devcgroup() is no longer used.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
    • cgroup: pass around cgroup_subsys_state instead of cgroup in file methods · 182446d0
      Tejun Heo authored
      cgroup is currently in the process of transitioning to using struct
      cgroup_subsys_state * as the primary handle instead of struct cgroup.
      Please see the previous commit which converts the subsystem methods
      for rationale.
      
      This patch converts all cftype file operations to take @css instead of
      @cgroup.  cftypes for the cgroup core files don't have their subsystem
      pointer set.  These will automatically use the dummy_css added by the
      previous patch and can be converted the same way.
      
      Most subsystem conversions are straightforward but there are some
      interesting ones.
      
      * freezer: update_if_frozen() is also converted to take @css instead
        of @cgroup for consistency.  This will make the code look simpler
        too once iterators are converted to use css.
      
      * memory/vmpressure: mem_cgroup_from_css() needs to be exported to
        vmpressure while mem_cgroup_from_cont() can be made static.
        Updated accordingly.
      
      * cpu: cgroup_tg() doesn't have any user left.  Removed.
      
      * cpuacct: cgroup_ca() doesn't have any user left.  Removed.
      
      * hugetlb: hugetlb_cgroup_form_cgroup() doesn't have any user left.
        Removed.
      
      * net_cls: cgrp_cls_state() doesn't have any user left.  Removed.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
    • cgroup: pass around cgroup_subsys_state instead of cgroup in subsystem methods · eb95419b
      Tejun Heo authored
      cgroup is currently in the process of transitioning to using struct
      cgroup_subsys_state * as the primary handle instead of struct cgroup *
      in subsystem implementations for the following reasons.
      
      * With unified hierarchy, subsystems will be dynamically bound and
        unbound from cgroups and thus css's (cgroup_subsys_state) may be
        created and destroyed dynamically over the lifetime of a cgroup,
        which is different from the current state where all css's are
        allocated and destroyed together with the associated cgroup.  This
        in turn means that cgroup_css() should be synchronized and may
        return NULL, making it more cumbersome to use.
      
      * Differing levels of per-subsystem granularity in the unified
        hierarchy means that the task and descendant iterators should behave
        differently depending on the specific subsystem the iteration is
        being performed for.
      
      * In the majority of cases, a subsystem only cares about its part in the
        cgroup hierarchy - i.e. the hierarchy of css's.  Subsystem methods
        often obtain the matching css pointer from the cgroup and don't
        bother with the cgroup pointer itself.  Passing around css fits
        much better.
      
      This patch converts all cgroup_subsys methods to take @css instead of
      @cgroup.  The conversions are mostly straight-forward.  A few
      noteworthy changes are
      
      * ->css_alloc() now takes css of the parent cgroup rather than the
        pointer to the new cgroup as the css for the new cgroup doesn't
        exist yet.  Knowing the parent css is enough for all the existing
        subsystems.
      
      * In kernel/cgroup.c::offline_css(), unnecessary open coded css
        dereference is replaced with local variable access.
      
      This patch shouldn't cause any behavior differences.
      
      v2: Unnecessary explicit cgrp->subsys[] deref in css_online() replaced
          with local variable @css as suggested by Li Zefan.
      
          Rebased on top of new for-3.12 which includes for-3.11-fixes so
          that ->css_free() invocation added by da0a12ca ("cgroup: fix a
          leak when percpu_ref_init() fails") is converted too.  Suggested
          by Li Zefan.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      Acked-by: Michal Hocko <mhocko@suse.cz>
      Acked-by: Vivek Goyal <vgoyal@redhat.com>
      Acked-by: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Daniel Wagner <daniel.wagner@bmw-carit.de>
      Cc: Peter Zijlstra <peterz@infradead.org>
      Cc: Ingo Molnar <mingo@redhat.com>
      Cc: Johannes Weiner <hannes@cmpxchg.org>
      Cc: Balbir Singh <bsingharora@gmail.com>
      Cc: Matt Helsley <matthltc@us.ibm.com>
      Cc: Jens Axboe <axboe@kernel.dk>
      Cc: Steven Rostedt <rostedt@goodmis.org>
      eb95419b
    • Tejun Heo's avatar
      cgroup: add css_parent() · 63876986
      Tejun Heo authored
      Currently, controllers have to explicitly follow the cgroup hierarchy
      to find the parent of a given css.  cgroup is moving towards using
      cgroup_subsys_state as the main controller interface construct, so
      let's provide a way to climb the hierarchy using just csses.
      
      This patch implements css_parent() which, given a css, returns its
      parent.  The function is guaranteed to return a valid non-NULL parent css as
      long as the target css is not at the top of the hierarchy.
      
      freezer, cpuset, cpu, cpuacct, hugetlb, memory, net_cls and devices
      are converted to use css_parent() instead of accessing cgroup->parent
      directly.
      
      * __parent_ca() is dropped from cpuacct and its usage is replaced with
        parent_ca().  The only difference between the two was NULL test on
        cgroup->parent which is now embedded in css_parent() making the
        distinction moot.  Note that eventually a css->parent field will be
        added to css and the NULL check in css_parent() will go away.
      
      This patch shouldn't cause any behavior differences.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      63876986
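The parent lookup described in the css_parent() commit above can be modelled in user space. This is only a sketch under stated assumptions: the structures are cut-down stand-ins rather than the kernel definitions, and `model_css_parent()`, `MODEL_SUBSYS_COUNT` and the `id` field are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_SUBSYS_COUNT 2	/* hypothetical number of subsystems */

struct cgroup;

/* Cut-down stand-in for the kernel structure, not the real definition. */
struct cgroup_subsys_state {
	struct cgroup *cgroup;	/* cgroup this css is attached to */
	int id;			/* index of the owning subsystem (assumed field) */
};

struct cgroup {
	struct cgroup *parent;	/* NULL for the root cgroup */
	struct cgroup_subsys_state *subsys[MODEL_SUBSYS_COUNT];
};

/*
 * Return the css of the same subsystem on the parent cgroup, or NULL when
 * @css sits at the top of the hierarchy.  The NULL test on ->parent lives
 * here so that callers do not have to repeat it.
 */
static struct cgroup_subsys_state *
model_css_parent(struct cgroup_subsys_state *css)
{
	struct cgroup *parent;

	assert(css != NULL && css->id >= 0 && css->id < MODEL_SUBSYS_COUNT);
	parent = css->cgroup->parent;
	return parent ? parent->subsys[css->id] : NULL;
}
```

With a helper of this shape, a controller can climb the hierarchy using only css pointers, which is why a separate NULL-checking variant such as `__parent_ca()` becomes redundant.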
    • Tejun Heo's avatar
      cgroup: add/update accessors which obtain subsys specific data from css · a7c6d554
      Tejun Heo authored
      css (cgroup_subsys_state) is usually embedded in a subsys specific
      data structure.  Subsystems either use container_of() directly to cast
      from css to such data structure or have an accessor function wrapping
      such cast.  As cgroup as a whole is moving towards using css as the main
      interface handle, add and update such accessors to ease dealing with
      css's.
      
      All accessors explicitly handle NULL input and return NULL in those
      cases.  While this looks like an extra branch in the code, as all
      controller-specific data structures have css as the first field, the
      casting doesn't involve any offsetting and the compiler can trivially
      optimize out the branch.
      
      * blkio, freezer, cpuset, cpu, cpuacct and net_cls didn't have such
        accessor.  Added.
      
      * memory, hugetlb and devices already had one but didn't explicitly
        handle NULL input.  Updated.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      a7c6d554
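The accessor pattern from the commit above, a container_of() cast guarded by a NULL check, might look roughly like this in user space. The names `struct demo_cgroup` and `css_demo()` are hypothetical, `struct cgroup_subsys_state` is reduced to a placeholder, and `container_of()` is re-defined locally in a simplified form because the kernel macro is not available outside the kernel.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder for the kernel structure; the real one has more fields. */
struct cgroup_subsys_state {
	int refcnt;
};

/* Hypothetical controller data with the css embedded as the first field. */
struct demo_cgroup {
	struct cgroup_subsys_state css;
	int limit;
};

/* Simplified user-space version of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* With css at offset 0 the cast needs no pointer adjustment. */
static_assert(offsetof(struct demo_cgroup, css) == 0,
	      "css must be the first field");

/* Accessor that maps NULL to NULL instead of producing a bogus pointer. */
static struct demo_cgroup *css_demo(struct cgroup_subsys_state *css)
{
	return css ? container_of(css, struct demo_cgroup, css) : NULL;
}
```

Because the offset is zero, the NULL branch and the cast yield the same pointer value, which is the reason the commit message expects the compiler to drop the branch.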
    • Tejun Heo's avatar
      cgroup: s/cgroup_subsys_state/cgroup_css/ s/task_subsys_state/task_css/ · 8af01f56
      Tejun Heo authored
      The names of the two struct cgroup_subsys_state accessors -
      cgroup_subsys_state() and task_subsys_state() - are somewhat awkward.
      The former clashes with the type name and the latter doesn't even
      indicate it's somehow related to cgroup.
      
      We're about to revamp a large portion of cgroup API, so, let's rename
      them so that they're less awkward.  Most per-controller usages of the
      accessors are localized in accessor wrappers and given the amount of
      scheduled changes, this isn't gonna add any noticeable headache.
      
      Rename cgroup_subsys_state() to cgroup_css() and task_subsys_state()
      to task_css().  This patch is a pure rename.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Li Zefan <lizefan@huawei.com>
      8af01f56
  17. 24 May, 2013 1 commit
    • Tejun Heo's avatar
      device_cgroup: simplify cgroup tree walk in propagate_exception() · d591fb56
      Tejun Heo authored
      During a config change, propagate_exception() needs to traverse the
      subtree to update config on the subtree.  Because such config updates
      need to allocate memory, it couldn't directly use
      cgroup_for_each_descendant_pre() which required the whole iteration to
      be contained in a single RCU read critical section.  To work around
      the limitation, propagate_exception() built a linked list of
      descendant cgroups while read-locking RCU and then walked the list
      afterwards, which is safe as the whole iteration is protected by
      devcgroup_mutex.  This works but is cumbersome.
      
      With the recent updates, cgroup iterators now allow dropping RCU read
      lock while iteration is in progress making this workaround no longer
      necessary.  This patch replaces dev_cgroup->propagate_pending list and
      get_online_devcg() with direct cgroup_for_each_descendant_pre() walk.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Reviewed-by: Michal Hocko <mhocko@suse.cz>
      d591fb56
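The direct subtree walk that replaces the pending list in the commit above is a pre-order traversal. The following is a user-space sketch of such a traversal, not the kernel's cgroup_for_each_descendant_pre(): `struct node` and `next_descendant_pre()` are hypothetical, locking and RCU are left out entirely, and skipping the root itself is a choice made for this sketch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tree node; real cgroups link children through list heads. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
};

/*
 * Return the node that follows @pos in a pre-order walk of the descendants
 * of @root, or NULL once the walk is complete.  Pass @pos == NULL to start.
 */
static struct node *next_descendant_pre(struct node *pos, struct node *root)
{
	assert(root != NULL);

	if (!pos)
		pos = root;	/* first call: behave as if root was just visited */

	if (pos->first_child)	/* descend before moving sideways */
		return pos->first_child;

	/* Leaf: climb until some ancestor below root has a next sibling. */
	while (pos != root) {
		if (pos->next_sibling)
			return pos->next_sibling;
		pos = pos->parent;
	}
	return NULL;
}
```

Since each step depends only on the current position, the caller is free to do work such as memory allocation between steps, provided something else (devcgroup_mutex in the commit above) keeps the tree stable.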
  18. 18 Apr, 2013 1 commit
  19. 08 Apr, 2013 1 commit
  20. 20 Mar, 2013 4 commits
    • Aristeu Rozanski's avatar
      devcg: propagate local changes down the hierarchy · bd2953eb
      Aristeu Rozanski authored
      This patch makes exception changes propagate down the hierarchy, respecting
      local exceptions when possible.
      
      New exceptions allowing additional access to devices won't be propagated, but
      it'll be possible to add an exception to access all or part of the newly
      allowed device(s).
      
      New exceptions disallowing access to devices will be propagated down and the
      local group's exceptions will be revalidated for the new situation.
      Example:
            A
           / \
              B
      
          group        behavior          exceptions
          A            allow             "b 8:* rwm", "c 116:1 rw"
          B            deny              "c 1:3 rwm", "c 116:2 rwm", "b 3:* rwm"
      
      If a new exception is added to group A:
      	# echo "c 116:* r" > A/devices.deny
      it'll propagate down and after revalidating B's local exceptions, the exception
      "c 116:2 rwm" will be removed.
      
      In case parent's exceptions change and local exceptions are not allowed anymore,
      they'll be deleted.
      
      v7:
      - do not allow behavior change when the cgroup has children
      - update documentation
      
      v6: fixed issues pointed by Serge Hallyn
      - only copy parent's exceptions while propagating behavior if the local
        behavior is different
      - while propagating exceptions, do not clear and copy parent's: it'd be against
        the premise we don't propagate access to more devices
      
      v5: fixed issues pointed by Serge Hallyn
      - updated documentation
      - not propagating when an exception is written to devices.allow
      - when propagating a new behavior, clean the local exceptions list if they're
        for a different behavior
      
      v4: fixed issues pointed by Tejun Heo
      - separated function to walk the tree and collect valid propagation targets
      
      v3: fixed issues pointed by Tejun Heo
      - update documentation
      - move css_online/css_offline changes to a new patch
      - use cgroup_for_each_descendant_pre() instead of own descendant walk
      - move exception_copy rework to a separated patch
      - move exception_clean rework to a separated patch
      
      v2: fixed issues pointed by Tejun Heo
      - instead of keeping the local settings that won't apply anymore, remove them
      
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      bd2953eb
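The revalidation example from the commit above (adding "c 116:* r" to A's deny list removes B's "c 116:2 rwm") can be approximated with a small user-space model. It rests on loudly stated assumptions: `struct demo_exception`, the `DEMO_*` constants and both functions are hypothetical, and the model only checks local exceptions against the single newly denied rule, whereas the kernel revalidates against the parent's full state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical access bits and wildcard used only by this sketch. */
#define DEMO_ACC_MKNOD	1
#define DEMO_ACC_READ	2
#define DEMO_ACC_WRITE	4
#define DEMO_ANY	(-1)	/* stands for '*' in major or minor */

struct demo_exception {
	char type;		/* 'b' (block) or 'c' (char) */
	int major, minor;	/* device numbers or DEMO_ANY */
	int access;		/* bitmask of DEMO_ACC_* */
};

/* True if both rules cover a common device and share an access bit. */
static bool demo_overlaps(const struct demo_exception *a,
			  const struct demo_exception *b)
{
	if (a->type != b->type)
		return false;
	if (a->major != DEMO_ANY && b->major != DEMO_ANY && a->major != b->major)
		return false;
	if (a->minor != DEMO_ANY && b->minor != DEMO_ANY && a->minor != b->minor)
		return false;
	return (a->access & b->access) != 0;
}

/*
 * Drop every local exception that the newly denied rule conflicts with,
 * compacting the array in place.  Returns the number of entries kept.
 */
static size_t demo_revalidate(struct demo_exception *local, size_t n,
			      const struct demo_exception *denied)
{
	size_t kept = 0;

	assert(local != NULL || n == 0);
	for (size_t i = 0; i < n; i++)
		if (!demo_overlaps(&local[i], denied))
			local[kept++] = local[i];
	return kept;
}
```

Running the model on B's three exceptions with the denied rule "c 116:* r" keeps "c 1:3 rwm" and "b 3:* rwm" and drops "c 116:2 rwm", which matches the outcome described in the commit message.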
    • Aristeu Rozanski's avatar
      devcg: use css_online and css_offline · 1909554c
      Aristeu Rozanski authored
      Allocate resources and change behavior only when online. This is needed in
      order to determine if a node is suitable for hierarchy propagation or if it's
      being removed.
      
      Locking:
      Both functions take devcgroup_mutex to make changes to device_cgroup structure.
      Hierarchy propagation will also take devcgroup_mutex before walking the
      tree while walking the tree itself is protected by rcu lock.
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      1909554c
    • Aristeu Rozanski's avatar
      devcg: prepare may_access() for hierarchy support · c39a2a30
      Aristeu Rozanski authored
      Currently may_access() is only able to verify if an exception is valid for the
      current cgroup, which has the same behavior. With hierarchy, it'll be also used
      to verify if a cgroup local exception is valid towards its cgroup parent, which
      might have different behavior.
      
      v2:
      - updated patch description
      - rebased on top of a new patch to expand the may_access() logic to make it
        more clear
      - fixed argument description order in may_access()
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      c39a2a30
    • Aristeu Rozanski's avatar
      devcg: expand may_access() logic · 26898fdf
      Aristeu Rozanski authored
      In order to make the next patch more clear, expand may_access() logic.
      
      v2: may_access() returns bool now
      Acked-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge Hallyn <serge.hallyn@canonical.com>
      Cc: Tejun Heo <tj@kernel.org>
      Cc: Serge Hallyn <serge.hallyn@canonical.com>
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      26898fdf
  21. 22 Feb, 2013 1 commit
  22. 21 Jan, 2013 1 commit
    • Jerry Snitselaar's avatar
      security/device_cgroup: lock assert fails in dev_exception_clean() · 103a197c
      Jerry Snitselaar authored
      devcgroup_css_free() calls dev_exception_clean() without the devcgroup_mutex being locked.
      
      Shutting down a kvm virt was giving me the following trace:
      
      [36280.732764] ------------[ cut here ]------------
      [36280.732778] WARNING: at /home/snits/dev/linux/security/device_cgroup.c:172 dev_exception_clean+0xa9/0xc0()
      [36280.732782] Hardware name: Studio XPS 8100
      [36280.732785] Modules linked in: xt_REDIRECT fuse ebtable_nat ebtables ipt_MASQUERADE iptable_nat nf_nat_ipv4 nf_nat xt_CHECKSUM iptable_mangle bridge stp llc nf_conntrack_ipv4 ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 nf_defrag_ipv4 ip6table_filter it87 hwmon_vid xt_state nf_conntrack ip6_tables snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec snd_hwdep snd_seq coretemp snd_seq_device crc32c_intel snd_pcm snd_page_alloc snd_timer snd broadcom tg3 serio_raw i7core_edac edac_core ptp pps_core lpc_ich pcspkr mfd_core soundcore microcode i2c_i801 nfsd auth_rpcgss nfs_acl lockd vhost_net sunrpc tun macvtap macvlan kvm_intel kvm uinput binfmt_misc autofs4 usb_storage firewire_ohci firewire_core crc_itu_t radeon drm_kms_helper ttm
      [36280.732921] Pid: 933, comm: libvirtd Tainted: G        W    3.8.0-rc3-00307-g4c217de #1
      [36280.732922] Call Trace:
      [36280.732927]  [<ffffffff81044303>] warn_slowpath_common+0x93/0xc0
      [36280.732930]  [<ffffffff8104434a>] warn_slowpath_null+0x1a/0x20
      [36280.732932]  [<ffffffff812deaf9>] dev_exception_clean+0xa9/0xc0
      [36280.732934]  [<ffffffff812deb2a>] devcgroup_css_free+0x1a/0x30
      [36280.732938]  [<ffffffff810ccd76>] cgroup_diput+0x76/0x210
      [36280.732941]  [<ffffffff8119eac0>] d_delete+0x120/0x180
      [36280.732943]  [<ffffffff81195cff>] vfs_rmdir+0xef/0x130
      [36280.732945]  [<ffffffff81195e47>] do_rmdir+0x107/0x1c0
      [36280.732949]  [<ffffffff8132d17e>] ? trace_hardirqs_on_thunk+0x3a/0x3f
      [36280.732951]  [<ffffffff81198646>] sys_rmdir+0x16/0x20
      [36280.732954]  [<ffffffff8173bd82>] system_call_fastpath+0x16/0x1b
      [36280.732956] ---[ end trace ca39dced899a7d9f ]---
      Signed-off-by: Jerry Snitselaar <jerry.snitselaar@oracle.com>
      Cc: stable@kernel.org
      Signed-off-by: James Morris <james.l.morris@oracle.com>
      103a197c
  23. 19 Nov, 2012 1 commit
  24. 06 Nov, 2012 3 commits
    • Tejun Heo's avatar
      device_cgroup: add lockdep asserts · 4b1c7840
      Tejun Heo authored
      device_cgroup uses RCU safe ->exceptions list which is write-protected
      by devcgroup_mutex and has had some issues using locking correctly.
      Add lockdep asserts to utility functions so that future errors can be
      easily detected.
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      4b1c7840
    • Tejun Heo's avatar
      device_cgroup: fix RCU usage · 201e72ac
      Tejun Heo authored
      dev_cgroup->exceptions is protected with devcgroup_mutex for writes
      and RCU for reads; however, RCU usage isn't correct.
      
      * dev_exception_clean() doesn't use RCU variant of list_del() and
        kfree().  The function can race with may_access() and may_access()
        may end up dereferencing already freed memory.  Use list_del_rcu()
        and kfree_rcu() instead.
      
      * may_access() may be called only with RCU read locked but doesn't use
        RCU safe traversal over ->exceptions.  Use list_for_each_entry_rcu().
      Signed-off-by: Tejun Heo <tj@kernel.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: stable@vger.kernel.org
      Cc: Aristeu Rozanski <aris@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      201e72ac
    • Aristeu Rozanski's avatar
      device_cgroup: fix unchecked cgroup parent usage · 64e10477
      Aristeu Rozanski authored
      In 4cef7299 ("device_cgroup: add proper checking when changing
      default behavior") the cgroup parent usage is unchecked.  root will not
      have a parent and trying to use device.{allow,deny} will cause problems.
      For some reason my stressing scripts didn't test the root directory so I
      didn't catch it on my regular tests.
      Signed-off-by: Aristeu Rozanski <aris@redhat.com>
      Cc: Li Zefan <lizefan@huawei.com>
      Cc: James Morris <jmorris@namei.org>
      Cc: Pavel Emelyanov <xemul@openvz.org>
      Acked-by: Serge E. Hallyn <serge.hallyn@ubuntu.com>
      Cc: Jiri Slaby <jslaby@suse.cz>
      Cc: Tejun Heo <tj@kernel.org>
      Signed-off-by: Tejun Heo <tj@kernel.org>
      64e10477
  25. 25 Oct, 2012 2 commits