1. 22 Dec, 2012 9 commits
  2. 21 Dec, 2012 31 commits
    • [media] em28xx: add support for RC6 mode 0 on devices that support it · 0dae8839
      Mauro Carvalho Chehab authored
      Newer em28xx chipsets (em2874 and newer) are capable of supporting
      RC6 codes, on both mode 0 (command mode, 16 bits payload size, similar
      to RC5, also called "Philips mode") and mode 6a (OEM command mode,
      which offers a few alternatives with regard to the payload size).
      I don't have any mode 6a remote at the moment to test it, so I opted
      to add support only for mode 0.
      After this patch, adding support for mode 6a should not be hard.
      Tested with a Philips television remote controller.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] em28xx: add support for NEC proto variants on em2874 and upper · 105e3687
      Mauro Carvalho Chehab authored
      By disabling the NEC parity check, it is possible to handle all 3 NEC
      protocol variants (32, 24 or 16 bits).
      Change the driver in order to handle all of them.
      Unfortunately, em2860/em2863 provide only 16 bits for the IR scancode,
      even when NEC parity is disabled. So, this change should affect only
      em2874 and newer devices, which provide up to 32 bits for the scancode.
      Tested with one NEC-16, one NEC-24 and one RC5 IR.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
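      As a rough illustration of how the three NEC variants can be told apart
      once the hardware parity check is off, here is a minimal userspace
      sketch. It is not the em28xx driver code: nec_scancode() and
      enum nec_variant are hypothetical names, and the byte order (address,
      inverted address, command, inverted command) is an assumption about how
      the frame is delivered.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, not the em28xx driver code. */
enum nec_variant { NEC_16, NEC_24, NEC_32 };

/*
 * Builds a scancode from the four bytes of a NEC frame, assumed to be
 * address, inverted address, command, inverted command, and reports which
 * variant was detected by checking which parity bytes hold.
 */
static uint32_t nec_scancode(const uint8_t b[4], enum nec_variant *variant)
{
    if ((b[2] ^ b[3]) != 0xff) {
        /* Command parity fails: treat all 32 bits as payload. */
        *variant = NEC_32;
        return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
               ((uint32_t)b[2] << 8) | b[3];
    }
    if ((b[0] ^ b[1]) != 0xff) {
        /* Only address parity fails: 16-bit address plus 8-bit command. */
        *variant = NEC_24;
        return ((uint32_t)b[0] << 16) | ((uint32_t)b[1] << 8) | b[2];
    }
    /* Both parity checks hold: 8-bit address plus 8-bit command. */
    *variant = NEC_16;
    return ((uint32_t)b[0] << 8) | b[2];
}
```

      A device that only exposes 16 bits of scancode cannot run the first two
      branches, which is consistent with the em2860/em2863 limitation noted
      in the commit message.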
    • em28xx: add two missing tuners at the Kconfig file · 37285bf2
      Mauro Carvalho Chehab authored
      Those two tuners may also be needed.
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Autoselect more relevant frontends for EM28XX DVB stick · fc09931e
      Jonathan McDowell authored
      I noticed that the EM28XX DVB driver doesn't auto select all of the
      appropriate DVB tuner modules required. In particular I needed
      DVB_LGDT3305 for my a340, but it looks like DVB_MT352 + DVB_S5H1409 were
      missing as well.
      Signed-off-by: Jonathan McDowell <noodles@earth.li>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ printks in lirc/igorplugusb.c · ce24c25b
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_warn(netdev, ... then dev_warn(dev, ... then pr_warn(...  to printk(KERN_WARNING ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ or pr_ printks in lirc/lirc_imon.c · 5c77dc40
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Also added pr_fmt.
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
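      For readers unfamiliar with the pr_fmt convention these cleanups rely
      on, the following is a userspace imitation of the pattern, not kernel
      code. DEMO_MODNAME and demo_info() are illustrative stand-ins: in a
      real driver, pr_fmt is defined before the printk header is included,
      typically using KBUILD_MODNAME, and pr_info() and friends pick it up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Illustrative stand-in for the module name a driver would use. */
#define DEMO_MODNAME "lirc_demo"

/* Every message formatted through pr_fmt() gets the module prefix. */
#define pr_fmt(fmt) DEMO_MODNAME ": " fmt

/*
 * Hypothetical userspace stand-in for pr_info(): formats into buf with the
 * pr_fmt() prefix applied. Uses the GNU ##__VA_ARGS__ extension, as the
 * kernel's own printk macros do.
 */
#define demo_info(buf, len, fmt, ...) \
    snprintf((buf), (len), pr_fmt(fmt), ##__VA_ARGS__)
```

      Calling demo_info(buf, sizeof(buf), "found device %d\n", 3) stores
      "lirc_demo: found device 3\n" in buf, so individual call sites no
      longer need to repeat the driver name by hand.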
    • [media] staging/media: Use pr_ printks in lirc/lirc_serial.c · df4f07b5
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_warn(netdev, ... then dev_warn(dev, ... then pr_warn(...  to printk(KERN_WARNING ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Also added pr_fmt.
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use pr_ printks in lirc/lirc_parallel.c · cc38b8e9
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_warn(netdev, ... then dev_warn(dev, ... then pr_warn(...  to printk(KERN_WARNING ...
      - WARNING: Prefer netdev_notice(netdev, ... then dev_notice(dev, ... then pr_notice(...  to printk(KERN_NOTICE ...
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Also added pr_fmt.
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use pr_ printks in lirc/lirc_bt829.c · f8a7df00
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Also added pr_fmt.
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use pr_ printks in lirc/lirc_sir.c · 014f0066
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Also added pr_fmt.
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ or pr_ printks in lirc/lirc_sasem.c · e174e6ca
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      - WARNING: Prefer netdev_warn(netdev, ... then dev_warn(dev, ... then pr_warn(...  to printk(KERN_WARNING ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Also added pr_fmt.
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: Use dev_ printks in solo6x10/p2m.c · e6da7661
      YAMANE Toshiaki authored
      Fixed the checkpatch warning below:
      - WARNING: Prefer netdev_warn(netdev, ... then dev_warn(dev, ... then pr_warn(...  to printk(KERN_WARNING ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: Use dev_ printks in go7007/s2250-board.c · 5820debc
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: Use dev_ printks in go7007/wis-tw2804.c · b441d033
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      - WARNING: Prefer netdev_dbg(netdev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: fixed spacing coding style in go7007/wis-uda1342.c · b91c78e3
      YAMANE Toshiaki authored
      Fixed the checkpatch error below:
      - ERROR: that open brace { should be on the previous line
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: Use dev_ printks in go7007/wis-uda1342.c · dd442d4d
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_dbg(netdev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: Use dev_ printks in go7007/go7007-v4l2.c · 34047bac
      YAMANE Toshiaki authored
      Fixed the checkpatch warning below:
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: Use dev_ printks in go7007/wis-tw9903.c · b2704e15
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_dbg(netdev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] Staging/media: fixed spacing coding style in go7007/wis-tw9903.c · 8ce21ecd
      YAMANE Toshiaki authored
      Fixed the checkpatch error below:
      - ERROR: that open brace { should be on the previous line
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] tm6000-dvb: Fix module unload · afca99a2
      Julian Scheel authored
      dvb_unregister_frontend() has to be called before detach; otherwise the
      unregister call will segfault. This made unloading the tm6000-dvb
      module unusable.
      Signed-off-by: Julian Scheel <julian@jusst.de>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ or pr_ printks in go7007/go7007-i2c.c · afc2e8a0
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      - WARNING: Prefer netdev_dbg(netdev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ printks in go7007/s2250-loader.c · b11558a3
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ printks in go7007/wis-sony-tuner.c · 6c629edc
      YAMANE Toshiaki authored
      Fixed the checkpatch warnings below:
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      - WARNING: Prefer netdev_dbg(netdev, ... then dev_dbg(dev, ... then pr_debug(...  to printk(KERN_DEBUG ...
      - WARNING: Prefer netdev_err(netdev, ... then dev_err(dev, ... then pr_err(...  to printk(KERN_ERR ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] staging/media: Use dev_ printks in go7007/go7007-driver.c · 6d569502
      YAMANE Toshiaki authored
      Fixed the checkpatch warning below:
      - WARNING: Prefer netdev_info(netdev, ... then dev_info(dev, ... then pr_info(...  to printk(KERN_INFO ...
      Signed-off-by: YAMANE Toshiaki <yamanetoshi@gmail.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] rc: Call rc_register_device before irq setup · 9fa35204
      Matthijs Kooijman authored
      This should fix a potential race condition, when the irq handler
      triggers while rc_register_device is still setting up the rdev->raw
      device.
      This crash has not been observed in practice, but there should be a very
      small window where it could occur. Since ir_raw_event_store_with_filter
      checks if rdev->raw is not NULL before using it, this bug is not
      triggered if the request_irq triggers a pending irq directly (since
      rdev->raw will still be NULL then).
      This commit was tested on nuvoton-cir only.
      
      Cc: Jarod Wilson <jarod@redhat.com>
      Cc: Maxim Levitsky <maximlevitsky@gmail.com>
      Cc: David Härdeman <david@hardeman.nu>
      Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
    • [media] rc: Set rdev before irq setup · d62b6818
      Matthijs Kooijman authored
      This fixes a problem in fintek-cir and nuvoton-cir where the
      irq handler would trigger during module load before the rdev member was
      set, causing a NULL pointer crash.
      It seems this crash is very reproducible (just bombard the receiver with
      IR signals during module load), probably because when request_irq is
      called, any pending interrupt is handled immediately, before
      request_irq returns and rdev can be set.
      This same crash was supposed to be fixed by commit
      9ef449c6 ("[media] rc: Postpone ISR
      registration"), but the crash was still observed on the nuvoton-cir
      driver.
      This commit was tested on nuvoton-cir only.
      
      Cc: Jarod Wilson <jarod@redhat.com>
      Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
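      The ordering problem described in this commit can be modelled in a few
      lines of userspace C. This is a simplified sketch, not the fintek-cir
      or nuvoton-cir code: demo_dev, demo_rc, demo_isr() and
      demo_request_irq() are hypothetical, and the stand-in for request_irq()
      simply delivers a pending interrupt before it returns, which is the
      behaviour the commit message suspects.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the driver state and the rc device. */
struct demo_rc { int events; };
struct demo_dev { struct demo_rc *rdev; int irq_pending; };

/* The handler dereferences dev->rdev unconditionally. */
static void demo_isr(struct demo_dev *dev)
{
    dev->rdev->events++;
}

/* Stand-in for request_irq(): a pending interrupt fires before return. */
static void demo_request_irq(struct demo_dev *dev,
                             void (*handler)(struct demo_dev *))
{
    if (dev->irq_pending) {
        dev->irq_pending = 0;
        handler(dev);
    }
}

/* Fixed ordering: publish rdev first, only then let the handler run. */
static int demo_probe(struct demo_dev *dev, struct demo_rc *rdev)
{
    dev->rdev = rdev;
    demo_request_irq(dev, demo_isr);
    return 0;
}
```

      Swapping the two statements in demo_probe() would make demo_isr()
      dereference a NULL rdev whenever an interrupt is already pending, which
      mirrors the crash reported above.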
    • [media] rc: Make probe cleanup goto labels more verbose · 70ef6991
      Matthijs Kooijman authored
      Before, labels were simply numbered. Now, the labels are named after the
      cleanup action they'll perform (first), based on how the winbond-cir
      driver does it. This makes the code a bit more clear and makes changes
      in the ordering of labels easier to review.
      This change is applied only to the rc drivers that do significant
      cleanup in their probe functions: ati-remote, ene-ir, fintek-cir,
      gpio-ir-recv, ite-cir, nuvoton-cir.
      This commit should not change any behavior; it just renames goto labels.
      
      [mchehab@redhat.com: removed changes at gpio-ir-recv.c, due to
       merge conflicts]
      Signed-off-by: Matthijs Kooijman <matthijs@stdin.nl>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
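      To show what "labels named after the cleanup action they perform
      (first)" looks like, here is a small userspace sketch. The demo_dev
      struct, the label names and the fail_at parameter are invented for
      illustration and are not taken from any of the rc drivers.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical device with two resources acquired during probe. */
struct demo_dev { char *buf; char *regs; };

/* fail_at is an illustrative knob that forces a failure at step 1, 2 or 3. */
static int demo_probe(struct demo_dev *dev, int fail_at)
{
    int ret = -1;

    dev->buf = (fail_at == 1) ? NULL : malloc(64);
    if (!dev->buf)
        goto exit;

    dev->regs = (fail_at == 2) ? NULL : malloc(16);
    if (!dev->regs)
        goto exit_free_buf;

    if (fail_at == 3)
        goto exit_free_regs;

    return 0;

    /* Each label names the first cleanup step it performs. */
exit_free_regs:
    free(dev->regs);
    dev->regs = NULL;
exit_free_buf:
    free(dev->buf);
    dev->buf = NULL;
exit:
    return ret;
}
```

      With numbered labels, inserting a new acquisition step means
      renumbering every later label; with descriptive labels, a reviewer can
      check each goto against the resources held at that point.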
    • [media] vivi: Optimize precalculate_line() · d40fbf8d
      Kirill Smelkov authored
      precalculate_line() is not very high in the profile, but it calls the
      expensive gen_twopix(), so let's polish it too:
          call gen_twopix() only once for every color bar and then distribute
          the result.
      before:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          #
          # Samples: 46K of event 'cycles'
          # Event count (approx.): 15574200568
          #
          # Overhead          Command         Shared Object
          # ........  ...............  ....................
          #
              27.99%             rawv  libc-2.13.so          [.] __memcpy_ssse3
              23.29%           vivi-*  [kernel.kallsyms]     [k] memcpy
              10.30%             Xorg  [unknown]             [.] 0xa75c98f8
               5.34%           vivi-*  [vivi]                [k] gen_text.constprop.6
               4.61%             rawv  [vivi]                [k] gen_twopix
               2.64%             rawv  [vivi]                [k] precalculate_line
               1.37%          swapper  [kernel.kallsyms]     [k] read_hpet
      after:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          #
          # Samples: 45K of event 'cycles'
          # Event count (approx.): 15561769214
          #
          # Overhead          Command         Shared Object
          # ........  ...............  ....................
          #
              30.73%             rawv  libc-2.13.so          [.] __memcpy_ssse3
              26.78%           vivi-*  [kernel.kallsyms]     [k] memcpy
              10.68%             Xorg  [unknown]             [.] 0xa73015e9
               5.55%           vivi-*  [vivi]                [k] gen_text.constprop.6
               1.36%          swapper  [kernel.kallsyms]     [k] read_hpet
               0.96%             Xorg  [kernel.kallsyms]     [k] read_hpet
               ...
               0.16%             rawv  [vivi]                [k] precalculate_line
               ...
               0.14%             rawv  [vivi]                [k] gen_twopix
      (i.e. gen_twopix and precalculate_line overheads are almost gone)
      Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
      Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
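      The idea of calling gen_twopix() once per colour bar and then
      distributing the result can be sketched as follows. This is a
      simplified userspace model, not the vivi code: demo_gen_twopix(), the
      fixed 2-byte pixel size and the byte pattern it writes are all
      assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_BARS 8
#define DEMO_PIXELSIZE 2      /* bytes per pixel, e.g. a 16-bit format */

static int demo_gen_calls;    /* counts calls to the "expensive" generator */

/* Stand-in for gen_twopix(): writes two pixels of the given bar colour. */
static void demo_gen_twopix(uint8_t *dst, int colour)
{
    demo_gen_calls++;
    for (int i = 0; i < 2 * DEMO_PIXELSIZE; i++)
        dst[i] = (uint8_t)(0x10 * colour + i);
}

/* Fills a line of width pixels; width must be a multiple of 2 * DEMO_BARS. */
static void demo_precalculate_line(uint8_t *line, int width)
{
    int bar_width = width / DEMO_BARS;

    for (int bar = 0; bar < DEMO_BARS; bar++) {
        uint8_t pix[2 * DEMO_PIXELSIZE];
        uint8_t *dst = line + bar * bar_width * DEMO_PIXELSIZE;

        demo_gen_twopix(pix, bar);              /* once per colour bar */
        for (int x = 0; x < bar_width; x += 2)  /* distribute the result */
            memcpy(dst + x * DEMO_PIXELSIZE, pix, sizeof(pix));
    }
}
```

      For a 64-pixel line the generator runs 8 times instead of 32, and the
      saving grows with the line width.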
    • [media] vivi: Move computations out of vivi_fillbuf linecopy loop · 13908f33
      Kirill Smelkov authored
      The "dev->mv_count % wmax" thing was showing high in profiles (we do it
      for each line, which is ~500 per frame)
                 │     000010c0 <vivi_fillbuff>:
                       ...
            0,39 │ 70:┌─→mov    0x3ff4(%edi),%esi
            0,22 │ 76:│  mov    0x2a0(%edi),%eax
            0,30 │    │  mov    -0x84(%ebp),%ebx
            0,35 │    │  mov    %eax,%edx
            0,04 │    │  mov    -0x7c(%ebp),%ecx
            0,35 │    │  sar    $0x1f,%edx
            0,44 │    │  idivl  -0x7c(%ebp)
           21,68 │    │  imul   %esi,%ecx
            0,70 │    │  imul   %esi,%ebx
            0,52 │    │  add    -0x88(%ebp),%ebx
            1,65 │    │  mov    %ebx,%eax
            0,22 │    │  imul   %edx,%esi
            0,04 │    │  lea    0x3f4(%edi,%esi,1),%edx
            2,18 │    │→ call   vivi_fillbuff+0xa6
            0,74 │    │  addl   $0x1,-0x80(%ebp)
           62,69 │    │  mov    -0x7c(%ebp),%edx
            1,18 │    │  mov    -0x80(%ebp),%ecx
            0,35 │    │  add    %edx,-0x84(%ebp)
            0,61 │    │  cmp    %ecx,-0x8c(%ebp)
            0,22 │    └──jne    70
      Since all the variables stay the same across iterations, let's move the
      computations out of the loop: the above-mentioned division and
      "width*pixelsize" too.
      before:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          #
          # Samples: 49K of event 'cycles'
          # Event count (approx.): 16475832370
          #
          # Overhead          Command           Shared Object
          # ........  ...............  ......................
          #
              29.07%             rawv  libc-2.13.so            [.] __memcpy_ssse3
              20.57%           vivi-*  [kernel.kallsyms]       [k] memcpy
              10.20%             Xorg  [unknown]               [.] 0xa7301494
               5.16%           vivi-*  [vivi]                  [k] gen_text.constprop.6
               4.43%             rawv  [vivi]                  [k] gen_twopix
               4.36%           vivi-*  [vivi]                  [k] vivi_fillbuff
               2.42%             rawv  [vivi]                  [k] precalculate_line
               1.33%          swapper  [kernel.kallsyms]       [k] read_hpet
      after:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          #
          # Samples: 46K of event 'cycles'
          # Event count (approx.): 15574200568
          #
          # Overhead          Command         Shared Object
          # ........  ...............  ....................
          #
              27.99%             rawv  libc-2.13.so          [.] __memcpy_ssse3
              23.29%           vivi-*  [kernel.kallsyms]     [k] memcpy
              10.30%             Xorg  [unknown]             [.] 0xa75c98f8
               5.34%           vivi-*  [vivi]                [k] gen_text.constprop.6
               4.61%             rawv  [vivi]                [k] gen_twopix
               2.64%             rawv  [vivi]                [k] precalculate_line
               1.37%          swapper  [kernel.kallsyms]     [k] read_hpet
               0.79%             Xorg  [kernel.kallsyms]     [k] read_hpet
               0.64%             Xorg  [kernel.kallsyms]     [k] unix_poll
               0.45%             Xorg  [kernel.kallsyms]     [k] fget_light
               0.43%             rawv  libxcb.so.1.1.0       [.] 0x0000aae9
               0.40%            runsv  [kernel.kallsyms]     [k] ext2_try_to_allocate
               0.36%             Xorg  [kernel.kallsyms]     [k] _raw_spin_lock_irqsave
               0.31%           vivi-*  [vivi]                [k] vivi_fillbuff
      (i.e. vivi_fillbuff's own overhead is almost gone)
      Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
      Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
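      A minimal sketch of the hoisting described in this commit, written as
      standalone C rather than the actual vivi_fillbuff(). The function name
      demo_fillbuf() and its parameter list are hypothetical; the point is
      only that the modulo and the width*pixelsize product are computed once
      before the loop.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Illustrative stand-in for the line-copy loop. line must hold at least
 * 2 * wmax pixels so that any start offset still leaves a full line.
 */
static void demo_fillbuf(uint8_t *vbuf, const uint8_t *line,
                         int wmax, int hmax, int pixelsize, int mv_count)
{
    /* Loop-invariant values, computed once instead of once per line. */
    size_t linesize = (size_t)wmax * pixelsize;
    const uint8_t *linestart = line + (size_t)(mv_count % wmax) * pixelsize;

    for (int h = 0; h < hmax; h++)
        memcpy(vbuf + h * linesize, linestart, linesize);
}
```

      A compiler cannot always hoist such expressions itself when the
      operands are loaded through a pointer and a function call sits inside
      the loop, which is presumably why doing it by hand paid off here.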
    • [media] vivi: vivi_dev->line[] was not aligned · 10ce8441
      Kirill Smelkov authored
      Though dev->line[] is a u8 array, we work with it as u16, u24 or u32
      pixels and also pass it to memcpy(), so it's better to align it to at
      least 4 bytes.
      Before the patch, on x86 offsetof(vivi_dev, line) was 1003, and after
      the patch it is 1004.
      There is a slight performance increase, but I think it is only slight
      because we start copying not from line[0]:
          ---- 8< ---- drivers/media/platform/vivi.c
          static void vivi_fillbuff(struct vivi_dev *dev, struct vivi_buffer *buf)
          {
                  ...
                  for (h = 0; h < hmax; h++)
                          memcpy(vbuf + h * wmax * dev->pixelsize,
                                 dev->line + (dev->mv_count % wmax) * dev->pixelsize,
                                 wmax * dev->pixelsize);
      before:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          #
          # Samples: 49K of event 'cycles'
          # Event count (approx.): 16799780016
          #
          # Overhead          Command         Shared Object
          # ........  ...............  ....................
          #
              27.51%             rawv  libc-2.13.so          [.] __memcpy_ssse3
              23.77%           vivi-*  [kernel.kallsyms]     [k] memcpy
               9.96%             Xorg  [unknown]             [.] 0xa76f5e12
               4.94%           vivi-*  [vivi]                [k] gen_text.constprop.6
               4.44%             rawv  [vivi]                [k] gen_twopix
               3.17%           vivi-*  [vivi]                [k] vivi_fillbuff
               2.45%             rawv  [vivi]                [k] precalculate_line
               1.20%          swapper  [kernel.kallsyms]     [k] read_hpet
          23.77%           vivi-*  [kernel.kallsyms]     [k] memcpy
                           |
                           --- memcpy
                              |
                              |--99.28%-- vivi_fillbuff
                              |          vivi_thread
                              |          kthread
                              |          ret_from_kernel_thread
                               --0.72%-- [...]
      after:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          #
          # Samples: 49K of event 'cycles'
          # Event count (approx.): 16475832370
          #
          # Overhead          Command           Shared Object
          # ........  ...............  ......................
          #
              29.07%             rawv  libc-2.13.so            [.] __memcpy_ssse3
              20.57%           vivi-*  [kernel.kallsyms]       [k] memcpy
              10.20%             Xorg  [unknown]               [.] 0xa7301494
               5.16%           vivi-*  [vivi]                  [k] gen_text.constprop.6
               4.43%             rawv  [vivi]                  [k] gen_twopix
               4.36%           vivi-*  [vivi]                  [k] vivi_fillbuff
               2.42%             rawv  [vivi]                  [k] precalculate_line
               1.33%          swapper  [kernel.kallsyms]       [k] read_hpet
      Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
      Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
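      The effect of aligning a byte array inside a struct can be checked with
      offsetof(). The structs below are hypothetical and much smaller than
      vivi_dev, and the GCC/Clang aligned attribute is used here as one way
      to request the alignment; the commit message does not say which
      spelling the patch itself uses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: three bytes of other state before the line buffer. */
struct demo_unaligned {
    uint8_t flags[3];
    uint8_t line[64];                              /* lands at offset 3 */
};

/* Same fields, with the buffer forced onto a 4-byte boundary. */
struct demo_aligned {
    uint8_t flags[3];
    uint8_t line[64] __attribute__((aligned(4)));  /* padded to offset 4 */
};
```

      Here offsetof(struct demo_unaligned, line) is 3 and
      offsetof(struct demo_aligned, line) is 4, the same one-byte shift as
      the 1003 to 1004 change reported for vivi_dev.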
    • [media] vivi: Optimize gen_text() · e3a8b4d2
      Kirill Smelkov authored
      I've noticed that vivi takes a lot of CPU to produce its frames.
      For example for 8 devices and 8 simple programs running, where each
      captures YUY2 640x480 and displays it to X via SDL, profile timing is as
      follows:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          # Samples: 82K of event 'cycles'
          # Event count (approx.): 31551930117
          #
          # Overhead          Command         Shared Object                                                           Symbol
          # ........  ...............  ....................
          #
              49.48%           vivi-*  [vivi]                [k] gen_twopix
              10.79%           vivi-*  [kernel.kallsyms]     [k] memcpy
              10.02%             rawv  libc-2.13.so          [.] __memcpy_ssse3
               8.35%           vivi-*  [vivi]                [k] gen_text.constprop.6
               5.06%             Xorg  [unknown]             [.] 0xa73015f8
               2.32%             rawv  [vivi]                [k] gen_twopix
               1.22%             rawv  [vivi]                [k] precalculate_line
               1.20%           vivi-*  [vivi]                [k] vivi_fillbuff
          (rawv is display program, vivi-* is a combination of vivi-000 through vivi-007)
      so a lot of time is spent in gen_twopix() which, as the following
      call-graph profile shows ...
          49.48%           vivi-*  [vivi]                [k] gen_twopix
                           |
                           --- gen_twopix
                              |
                              |--96.30%-- gen_text.constprop.6
                              |          vivi_fillbuff
                              |          vivi_thread
                              |          kthread
                              |          ret_from_kernel_thread
                              |
                               --3.70%-- vivi_fillbuff
                                         vivi_thread
                                         kthread
                                         ret_from_kernel_thread
      ... is called mostly from gen_text().
      If we look at gen_text(), in the inner loop we see
          if (chr & (1 << (7 - i)))
                  gen_twopix(dev, pos + j * dev->pixelsize, WHITE, (x+y) & 1);
          else
                  gen_twopix(dev, pos + j * dev->pixelsize, TEXT_BLACK, (x+y) & 1);
      which calls gen_twopix() for every character pixel, and that is very
      expensive, because gen_twopix() branches several times.
      Now, note that we operate on only two colors, WHITE and TEXT_BLACK,
      so the pixels for those colors can be precomputed and gen_twopix()
      moved out of the inner loop. Also note that for black and white,
      even/odd makes no difference for any supported pixel format, so we
      can stop playing the `odd` gen_twopix() parameter game.
      So the first thing we are doing here is
          1) moving gen_twopix() calls out of gen_text() into vivi_fillbuff(),
             to pregenerate black and white colors, just before printing
             starts.
      Next, gen_text()'s font rendering loop, even with the gen_twopix()
      calls moved out, was inefficient and branchy, so let's
          2) rewrite the gen_text() loop so it uses fewer variables, unroll
             the character horizontal-rendering loop, and instantiate 3 code
             paths for pixelsizes 2, 3 and 4, so that no inner loop has to
             branch or make indirections (*).
      With all of the above rework done, we get nice, non-branchy,
      streamlined code for gen_text() (showing the loop for pixelsize=2):
                 │       cmp    $0x2,%eax
                 │     ↓ jne    26
                 │       mov    -0x18(%ebp),%eax
                 │       mov    -0x20(%ebp),%edi
                 │       imul   -0x20(%ebp),%eax
                 │       movzwl 0x3ffc(%ebx),%esi
            0,08 │       movzwl 0x4000(%ebx),%ecx
            0,04 │       add    %edi,%edi
                 │       mov    0x0,%ebx
            0,51 │       mov    %edi,-0x1c(%ebp)
                 │       mov    %ebx,-0x14(%ebp)
                 │       movl   $0x0,-0x10(%ebp)
                 │       lea    0x20(%edx,%eax,2),%eax
                 │       mov    %eax,-0x18(%ebp)
                 │       xchg   %ax,%ax
            0,04 │ a0:   mov    0x8(%ebp),%ebx
                 │       mov    -0x18(%ebp),%eax
            0,04 │       movzbl (%ebx),%edx
            0,16 │       test   %dl,%dl
            0,04 │     ↓ je     128
            0,08 │       lea    0x0(%esi),%esi
            1,61 │ b0:┌─→shl    $0x4,%edx
            1,02 │    │  mov    -0x14(%ebp),%edi
            2,04 │    │  add    -0x10(%ebp),%edx
            2,24 │    │  lea    0x1(%ebx),%ebx
            0,27 │    │  movzbl (%edi,%edx,1),%edx
            9,92 │    │  mov    %esi,%edi
            0,39 │    │  test   %dl,%dl
            2,04 │    │  cmovns %ecx,%edi
            4,63 │    │  test   $0x40,%dl
            0,55 │    │  mov    %di,(%eax)
            3,76 │    │  mov    %esi,%edi
            0,71 │    │  cmove  %ecx,%edi
            3,41 │    │  test   $0x20,%dl
            0,75 │    │  mov    %di,0x2(%eax)
            2,43 │    │  mov    %esi,%edi
            0,59 │    │  cmove  %ecx,%edi
            4,59 │    │  test   $0x10,%dl
            0,67 │    │  mov    %di,0x4(%eax)
            2,55 │    │  mov    %esi,%edi
            0,78 │    │  cmove  %ecx,%edi
            4,31 │    │  test   $0x8,%dl
            0,67 │    │  mov    %di,0x6(%eax)
            5,76 │    │  mov    %esi,%edi
            1,80 │    │  cmove  %ecx,%edi
            4,20 │    │  test   $0x4,%dl
            0,86 │    │  mov    %di,0x8(%eax)
            2,98 │    │  mov    %esi,%edi
            1,37 │    │  cmove  %ecx,%edi
            4,67 │    │  test   $0x2,%dl
            0,20 │    │  mov    %di,0xa(%eax)
            2,78 │    │  mov    %esi,%edi
            0,75 │    │  cmove  %ecx,%edi
            3,92 │    │  and    $0x1,%edx
            0,75 │    │  mov    %esi,%edx
            2,59 │    │  mov    %di,0xc(%eax)
            0,59 │    │  cmove  %ecx,%edx
            3,10 │    │  mov    %dx,0xe(%eax)
            2,39 │    │  add    $0x10,%eax
            0,51 │    │  movzbl (%ebx),%edx
            2,86 │    │  test   %dl,%dl
            2,31 │    └──jne    b0
            0,04 │128:   addl   $0x1,-0x10(%ebp)
            4,00 │       mov    -0x1c(%ebp),%eax
            0,04 │       add    %eax,-0x18(%ebp)
            0,08 │       cmpl   $0x10,-0x10(%ebp)
                 │     ↑ jne    a0
      which almost goes away from the profile:
          # cmdline : /home/kirr/local/perf/bin/perf record -g -a sleep 20
          # Samples: 49K of event 'cycles'
          # Event count (approx.): 16799780016
          #
          # Overhead          Command         Shared Object                                                           Symbol
          # ........  ...............  ....................
          #
              27.51%             rawv  libc-2.13.so          [.] __memcpy_ssse3
              23.77%           vivi-*  [kernel.kallsyms]     [k] memcpy
               9.96%             Xorg  [unknown]             [.] 0xa76f5e12
               4.94%           vivi-*  [vivi]                [k] gen_text.constprop.6
               4.44%             rawv  [vivi]                [k] gen_twopix
               3.17%           vivi-*  [vivi]                [k] vivi_fillbuff
               2.45%             rawv  [vivi]                [k] precalculate_line
               1.20%          swapper  [kernel.kallsyms]     [k] read_hpet
      i.e. gen_twopix() overhead dropped from 49% to 4% and gen_text() loops
      from ~8% to ~4%, and the overall cycle count dropped from 31551930117 to
      16799780016, which is a ~1.9x speedup for the whole workload.
      (*) for RGB24 rendering I've introduced x24, which can be thought of as
          a synthetic u24 that simplifies the code. That's done because when
          memcpy is used for conditional assignment, gcc generates suboptimal
          code with more indirections.
          Fortunately, struct assignment is built into C, and that's all we
          need from the pixel type for font rendering.
      Signed-off-by: Kirill Smelkov <kirr@mns.spb.ru>
      Acked-by: Hans Verkuil <hans.verkuil@cisco.com>
      Signed-off-by: Mauro Carvalho Chehab <mchehab@redhat.com>
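      The two ideas in this commit, precomputed foreground/background pixels
      and a struct-based 24-bit pixel type, can be sketched in standalone C.
      This is an illustration under assumptions, not the vivi implementation:
      demo_x24, demo_row_u16() and demo_row_x24() are hypothetical names, and
      bit 7 of the glyph byte is assumed to be the leftmost pixel, matching
      the (1 << (7 - i)) test quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Synthetic 24-bit pixel: plain struct assignment copies all three bytes. */
typedef struct { uint8_t b[3]; } demo_x24;

/*
 * Renders one 8-pixel glyph row for a 2-byte pixel format. fg and bg are
 * computed once by the caller, so the loop body is a select per pixel with
 * no call into a pixel generator. The row is unrolled by hand.
 */
static void demo_row_u16(uint16_t *pos, uint8_t chr, uint16_t fg, uint16_t bg)
{
    pos[0] = (chr & 0x80) ? fg : bg;
    pos[1] = (chr & 0x40) ? fg : bg;
    pos[2] = (chr & 0x20) ? fg : bg;
    pos[3] = (chr & 0x10) ? fg : bg;
    pos[4] = (chr & 0x08) ? fg : bg;
    pos[5] = (chr & 0x04) ? fg : bg;
    pos[6] = (chr & 0x02) ? fg : bg;
    pos[7] = (chr & 0x01) ? fg : bg;
}

/* Same row for a 3-byte format, using struct assignment instead of memcpy. */
static void demo_row_x24(demo_x24 *pos, uint8_t chr, demo_x24 fg, demo_x24 bg)
{
    for (int i = 0; i < 8; i++)
        pos[i] = (chr & (0x80 >> i)) ? fg : bg;
}
```

      Each assignment in demo_row_u16() is the kind of statement a compiler
      can turn into a conditional move, which is consistent with the
      cmove/cmovns sequence in the annotated listing above.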