Toybox - strings contribution

/* 
written by kaspy (kaspyx@gmail.com)
*/ 


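It has been a really long time since I last posted about toybox,

so here is the source code from when I contributed the strings command a while back.

The strings command prints the sequences of printable characters (ASCII and the like) found in a binary.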
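To make the idea concrete before the toybox version, here is a minimal standalone sketch that scans a buffer already held in memory. It is not the contributed code: `is_printable` and `print_runs` are hypothetical names made up for illustration, and the real command streams from a file descriptor instead of working on a whole buffer.

```c
#include <stddef.h>
#include <stdio.h>

/* Same printable test as the strings listing: ASCII 32..126 plus tab. */
static int is_printable(unsigned char c)
{
  return (c >= 32 && c <= 126) || c == '\t';
}

/* Write every printable run of at least min_len bytes in buf to out,
 * one run per line. Returns the number of runs written.
 * (Hypothetical helper, not part of toybox.) */
size_t print_runs(const unsigned char *buf, size_t len, size_t min_len,
                  FILE *out)
{
  size_t i, start = 0, count = 0, found = 0;

  /* Iterate one step past the end so a run that reaches the end of the
   * buffer is flushed by the same code path as any other run. */
  for (i = 0; i <= len; i++) {
    if (i < len && is_printable(buf[i])) {
      if (!count++) start = i;   /* remember where this run began */
      continue;
    }
    if (count && count >= min_len) {
      fwrite(buf + start, 1, count, out);
      fputc('\n', out);
      found++;
    }
    count = 0;
  }

  return found;
}
```

Calling `print_runs` with a minimum length of 4 on the 13 bytes `ab`, NUL, `hello`, 0x01, `wxyz` writes the two lines `hello` and `wxyz` and skips `ab` because it is too short.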

Once you have written the code to fit the toybox infrastructure, you send it to toybox@lists.landley.net.


/* strings.c - print the strings of printable characters in files.
 *
 * Copyright 2014 Kyung-su Kim <kaspyx@gmail.com>
 * Copyright 2014 Kyungwan Han <asura321@gmail.com>
 *
 * No Standard
 * TODO: utf8 strings
 * TODO: posix -t

USE_STRINGS(NEWTOY(strings, "an#=4<1fo", TOYFLAG_USR|TOYFLAG_BIN))

config STRINGS
  bool "strings"
  default n
  help
    usage: strings [-fo] [-n LEN] [FILE...]

    Display printable strings in a binary file

    -f  Precede strings with filenames
    -n  At least LEN characters form a string (default 4)
    -o  Precede strings with decimal offsets
*/

#define FOR_strings
#include "toys.h"

GLOBALS(
  long num;
)

void do_strings(int fd, char *filename)
{
  int nread, i, wlen = TT.num, count = 0;
  off_t offset = 0;
  // Holds the first wlen printable bytes of a candidate string.
  char *string = xzalloc(wlen + 1);

  for (;;) {
    nread = read(fd, toybuf, sizeof(toybuf));
    if (nread < 0) perror_msg("%s", filename);
    if (nread < 1) break;
    for (i = 0; i < nread; i++, offset++) {
      if (((toybuf[i] >= 32) && (toybuf[i] <= 126)) || (toybuf[i] == '\t')) {
        // Already past the minimum length: pass the byte straight through.
        if (count == wlen) fputc(toybuf[i], stdout);
        else {
          string[count++] = toybuf[i];
          if (count == wlen) {
            if (toys.optflags & FLAG_f) printf("%s: ", filename);
            // offset still indexes the current byte, so the string began
            // wlen - 1 bytes earlier.
            if (toys.optflags & FLAG_o)
              printf("%7lld ", (long long)(offset - wlen + 1));
            printf("%s", string);
          }
        }
      } else {
        if (count == wlen) xputc('\n');
        count = 0;
      }
    }
  }
  // Terminate a string that runs all the way to end of file.
  if (count == wlen) xputc('\n');
  xclose(fd);
  free(string);
}

void strings_main(void)
{
  loopfiles(toys.optargs, do_strings);
}

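The code above is the version Rob polished; I have forgotten where my original went. The source you send in gets a short code review from Rob, is polished a little further, and is then committed.

Below is the code review I received from Rob.

He says UTF-8 handling will be needed at some point.

He also says he suspected a fencepost error, but it turned out there wasn't one!
(What is a fencepost error? -> http://egloos.zum.com/tools/v/18549)

As you can see, plenty of commands are not hard to write, so if you are interested in developing Linux commands and contributing them, give it a try!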


 

Tweak help so -N is consistently using "LEN" instead of sometimes "count".

 

Redo the read() part to catch errors in reading from the file. (I'd like to have a warning-only version of xread() but we'd still have to check for errors on return to break out of the file, so it wouldn't save us much. Have to think about it...)
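The "warn but keep going" read that Rob is describing might look roughly like the sketch below. This is an assumption about the shape of such a helper, not anything in toybox: `read_or_warn` and `count_bytes` are hypothetical names, and retrying on `EINTR` is a choice made here rather than something the review asks for. It also shows his point that the caller still has to check the return value to break out of the loop.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical warning-only read: retries when interrupted by a signal,
 * prints a warning on any other error, and never exits.
 * Returns bytes read, 0 at end of file, or -1 on error. */
ssize_t read_or_warn(int fd, void *buf, size_t len, const char *name)
{
  ssize_t n;

  do n = read(fd, buf, len);
  while (n < 0 && errno == EINTR);

  if (n < 0) fprintf(stderr, "%s: %s\n", name, strerror(errno));

  return n;
}

/* Example caller: the loop condition is the error check, so a failed
 * read ends this file without aborting the whole program.
 * Returns the byte count, or -1 if a read failed. */
long long count_bytes(int fd, const char *name)
{
  char buf[4096];
  long long total = 0;
  ssize_t n;

  while ((n = read_or_warn(fd, buf, sizeof(buf), name)) > 0) total += n;

  return n < 0 ? -1 : total;
}
```

With this shape a bad file descriptor produces one warning on stderr and a -1 return, and the caller can move on to the next file.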

 

This doesn't handle UTF8 strings. At some point in the future I may want to redo it so it does. (I guess 4 consecutive valid utf8 characters count as a hit, and then continue as long as the utf8 sequences are valid? Not sure how that would affect the false positive rate...)
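Rob's idea of counting consecutive valid UTF-8 characters could be prototyped along these lines. This is only a sketch under stated assumptions: `utf8_printable_len` and `utf8_run_chars` are hypothetical helpers, single bytes are accepted under the same printable test as the listing, and rejecting overlong forms, surrogates, and C1 control characters is a choice made here, not something the review specifies.

```c
#include <stddef.h>

/* Return the byte length (1..4) of the printable UTF-8 character that
 * starts at s, or 0 if the bytes are invalid, truncated, or a control
 * character. avail is the number of bytes readable at s.
 * (Hypothetical helper, not part of toybox.) */
size_t utf8_printable_len(const unsigned char *s, size_t avail)
{
  size_t len, i;
  unsigned long cp;

  if (!avail) return 0;
  if (s[0] < 0x80) return ((s[0] >= 32 && s[0] <= 126) || s[0] == '\t');

  if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
  else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
  else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
  else return 0;                       /* stray continuation or bad lead */
  if (avail < len) return 0;           /* sequence is cut off */

  for (i = 1; i < len; i++) {
    if ((s[i] & 0xC0) != 0x80) return 0;
    cp = (cp << 6) | (s[i] & 0x3F);
  }

  /* Reject overlong encodings, C1 controls, surrogates, out of range. */
  if (len == 2 && cp < 0xA0) return 0;
  if (len == 3 && (cp < 0x800 || (cp >= 0xD800 && cp <= 0xDFFF))) return 0;
  if (len == 4 && (cp < 0x10000 || cp > 0x10FFFF)) return 0;

  return len;
}

/* Count how many consecutive printable characters start at s. A caller
 * could treat a result of at least LEN as a hit. */
size_t utf8_run_chars(const unsigned char *s, size_t avail)
{
  size_t chars = 0, n;

  while ((n = utf8_printable_len(s, avail))) {
    s += n;
    avail -= n;
    chars++;
  }

  return chars;
}
```

Counting characters rather than bytes means a run of four two-byte characters qualifies at the default length of 4; how that affects the false positive rate is the open question Rob raises.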

 

Posix wants -t and hex output, this doesn't do that. Might also revisit for that. Accept (and ignore) -a.
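POSIX defines `-t` with a one-letter format of `d`, `o`, or `x` for decimal, octal, or hexadecimal offsets. A possible way to format the offset is sketched below; `format_offset` is a hypothetical helper, and the field width of 7 is simply carried over from the `-o` output in the listing, not taken from the standard.

```c
#include <stddef.h>
#include <stdio.h>

/* Format offset into out according to a -t style radix letter:
 * 'd' decimal, 'o' octal, 'x' hexadecimal. Returns the number of
 * characters written, or -1 for an unknown radix.
 * (Hypothetical helper, not part of toybox.) */
int format_offset(char *out, size_t size, char radix, long long offset)
{
  switch (radix) {
    case 'd':
      return snprintf(out, size, "%7lld ", offset);
    case 'o':
      return snprintf(out, size, "%7llo ", (unsigned long long)offset);
    case 'x':
      return snprintf(out, size, "%7llx ", (unsigned long long)offset);
    default:
      return -1;
  }
}
```

Returning -1 for an unrecognized letter leaves the decision about how to report a bad `-t` argument to the caller.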

 

Redo output so only the newline is using xputc(), and the other stuff is using printf() and fputc(). The rationale is to let the stdio buffer do its thing, rather than making a syscall for every byte. Hopefully if we have a short write stdio will notice and either retry or set the error flag, and then xputc() catches the error flag and the error case of nontransient output failure (disk full, pipe to a dead process, etc).

For the common case stdio should buffer the output and xputc flush it anyway.
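The output pattern Rob describes, cheap buffered writes for the bytes and one checked write per line, can be sketched like this. `putc_checked` is a hypothetical stand-in that only mimics the behavior the review attributes to xputc (flush and notice errors); it returns -1 where the toybox function would be expected to report the failure itself, and `emit_line` is likewise made up for illustration.

```c
#include <stdio.h>

/* Hypothetical stand-in for a checked putc: write one character, flush
 * the stream, and report failure. The ferror() test also catches an
 * error flag left behind by earlier unchecked writes.
 * Returns 0 on success, -1 on error. */
int putc_checked(int c, FILE *out)
{
  if (fputc(c, out) == EOF) return -1;
  if (fflush(out) == EOF) return -1;

  return ferror(out) ? -1 : 0;
}

/* Emit one string: the bytes only go into the stdio buffer, and the
 * trailing newline is the single point that flushes and checks. */
int emit_line(const char *s, FILE *out)
{
  while (*s) fputc(*s++, out);

  return putc_checked('\n', out);
}
```

The design trade is one error check per line instead of one per byte, which relies on stdio keeping its error flag set after a failed write.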

 

trivial: I made wlen be equal to TT.len instead of TT.len-1, had the assignment postincrement, and checked for equality rather than greater than. (Having the constraint be TT.len-1 and the allocation be TT.len+1 looked like a fencepost error. It wasn't, but I didn't want other readers to have to work out why.)

 

Couple whitespace tweaks. Other than that, it was in pretty good shape.

 

Rob
