Saturday, October 27, 2012

hackathon project: pfiles - postmortem analysis

During the first days of October 2012, we had an illumos day, and a one day illumos hackathon that was very well attended by some of the best and brightest in our community (and I daresay, in any open source community.)

The purpose of the hackathon, besides just doing some cool projects, is to give folks a chance to interact with other domain experts, and code, particularly in areas outside their particular area of expertise.

This year a number of interesting projects were worked on, and I think some of these are at the point of starting to bear fruit.  The project I undertook was the addition of post-mortem analysis support for pfiles(1).  That is, making it possible to use pfiles against a core(4) file.  Adam Leventhal suggested the project, and provided the initial suggestion on the method to use for doing the work.

This is a particularly interesting project, because the information that pfiles reports is located in kernel state (in the uarea) and has not traditionally been available in a core file.

To complete this project, I had to work in a few areas that were unfamiliar to me.  Especially I had never looked at the innards of ELF files, nor at the process by which a core file is generated.  To add the necessary information, I had to create a new type of ELF note, and then add code to generate it to libproc (used by gcore) and elfexec (for kernel driven core dumps).  I also had to add support to parse this to libproc (for pfiles), and elfdump.  The project turned out to be bigger than anticipated, consisting of almost a thousand lines of code.  So it took me a little more than a day to complete it.

A webrev with the associated changes is presented for your enjoyment.  I'd appreciate any review feedback, and especially any further testing.

Here's a sample run:




garrett@openindiana{250}> cat < /dev/zero &
[2] 144241
garrett@openindiana{251}> pfiles 144241
144241: cat
  Current rlimit: 256 file descriptors
   0: S_IFCHR mode:0666 dev:526,0 ino:35127324 uid:0 gid:3 rdev:67,12
      O_RDONLY|O_LARGEFILE
      /devices/pseudo/mm@0:zero
      offset:72712192
   1: S_IFCHR mode:0620 dev:527,0 ino:893072809 uid:101 gid:7 rdev:27,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /dev/pts/2
      offset:72704245
   2: S_IFCHR mode:0620 dev:527,0 ino:893072809 uid:101 gid:7 rdev:27,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /dev/pts/2
      offset:72704245
garrett@openindiana{252}> gcore 144241
gcore: core.144241 dumped
garrett@openindiana{253}> pfiles core.144241 
core 'core.144241' of 144241: cat
   0: S_IFCHR mode:0666 dev:526,0 ino:35127324 uid:0 gid:3 rdev:67,12
      O_RDONLY|O_LARGEFILE
      /devices/pseudo/mm@0:zero
      offset:195125248
   1: S_IFCHR mode:0620 dev:527,0 ino:893072809 uid:101 gid:7 rdev:27,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /dev/pts/2
      offset:161882375
   2: S_IFCHR mode:0620 dev:527,0 ino:893072809 uid:101 gid:7 rdev:27,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /dev/pts/2
      offset:161882375
garrett@openindiana{254}> kill -11 144241
garrett@openindiana{255}> pfiles core
core 'core' of 144241: cat
   0: S_IFCHR mode:0666 dev:526,0 ino:35127324 uid:0 gid:3 rdev:67,12
      O_RDONLY|O_LARGEFILE
      /devices/pseudo/mm@0:zero
      offset:419414016
   1: S_IFCHR mode:0620 dev:527,0 ino:893072809 uid:101 gid:7 rdev:27,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /dev/pts/2
      offset:296960430
   2: S_IFCHR mode:0620 dev:527,0 ino:893072809 uid:101 gid:7 rdev:27,2
      O_RDWR|O_NOCTTY|O_LARGEFILE
      /dev/pts/2
      offset:296960430
[2]  - Segmentation fault            cat < /dev/zero (core dumped)

garrett@openindiana{256}> elfdump core
...


  entry [10]
    namesz: 0x5
    descsz: 0x440
    type:   [ NT_FDINFO ]
    name:
        CORE\0
    desc: (prfdinfo_t)
        pr_fd:        0
        pr_mode:      [ S_IFCHR S_IRUSR S_IWUSR S_IRGRP S_IWGRP S_IROTH S_IWOTH ]
        pr_uid:       0                  pr_gid:      3
        pr_major:     526                pr_minor:    0
        pr_rmajor:    67                 pr_rminor:   12
        pr_ino:       35127324
        pr_size:      0                  pr_offset:   419414016
        pr_fileflags: [ O_RDONLY O_LARGEFILE ]
        pr_fdflags:   0
        pr_path:      /devices/pseudo/mm@0:zero

  entry [11]
    namesz: 0x5
    descsz: 0x440
    type:   [ NT_FDINFO ]
    name:
        CORE\0
    desc: (prfdinfo_t)
        pr_fd:        1
        pr_mode:      [ S_IFCHR S_IRUSR S_IWUSR S_IWGRP ]
        pr_uid:       101                pr_gid:      7
        pr_major:     527                pr_minor:    0
        pr_rmajor:    27                 pr_rminor:   2
        pr_ino:       893072809
        pr_size:      0                  pr_offset:   296960430
        pr_fileflags: [ O_RDONLY O_NOCTTY O_LARGEFILE ]
        pr_fdflags:   0
        pr_path:      /dev/pts/2

  entry [12]
    namesz: 0x5
    descsz: 0x440
    type:   [ NT_FDINFO ]
    name:
        CORE\0
    desc: (prfdinfo_t)
        pr_fd:        2
        pr_mode:      [ S_IFCHR S_IRUSR S_IWUSR S_IWGRP ]
        pr_uid:       101                pr_gid:      7
        pr_major:     527                pr_minor:    0
        pr_rmajor:    27                 pr_rminor:   2
        pr_ino:       893072809
        pr_size:      0                  pr_offset:   296960430
        pr_fileflags: [ O_RDONLY O_NOCTTY O_LARGEFILE ]
        pr_fdflags:   0
        pr_path:      /dev/pts/2
...


No comments: