I've just read Federico Kereki's article about interrogating a Linux
system titled “What's in the Box? Interrogate Your Hardware” in the December 2015 issue. I love this kind of article and hope to see more!
—
Brian Clark
Federico Kereki replies: Thanks, Mr Clark, for your kind words. The article grew out of my actual need to know about the hardware in my own machine, and because of Linux's openness, I learned even more than I had expected. I'm glad you liked my results!
Regarding Greg Bledsoe's “Server Hardening” article in the November 2015
issue: great article—lots of detailed help in one source. One question: is it possible to
mention specific logs you reference in the article? I get lost quickly when seeing the
large number of logs scattered about my Debian server.
—
Tom Browder
I believe that the final version of the script that Dave Taylor came up with in his Work the Shell column titled “Analyzing Comma-Separated Values (CSV) Files” in the December 2015 issue of Linux Journal contains an oversight. Specifically, it does not handle the case in which more than one field contains commas (for example, the dollar amount field and the comment field). I have modified Dave's script to take this into account. Hopefully, this will be of some help. I always enjoy Dave's column and have learned a lot from it. Here's the modified script:
#! /bin/bash -
# fixcsv
# fix CSV files with embedded commas

# The problem is that some spreadsheet fields may contain commas. In the
# sample case, this includes the dollar amount and comment fields. I
# believe you overlooked the case in which both the dollar amount and
# comment fields contain commas. Your script assumes that there is at
# most one such instance.

# The simplest solution is to export the spreadsheet contents with some
# field delimiter that can never appear in any field, e.g., a tab. Then
# write the script using this delimiter.

# Original code
# while read inline
# do
#   if [ ! -z "$(echo $inline | grep \")" ]
#   then
#     f1=$(echo $inline | cut -d\" -f1)
#     f2=$(echo $inline | cut -d\" -f2)
#     f3=$(echo $inline | cut -d\" -f3)
#     echo $f1`echo $f2|sed 's/,//g'`$f3
#   else
#     echo $inline
#   fi
# done
# exit 0
# This works correctly ONLY when there is EXACTLY ONE field enclosed in
# double quotes.

# Revised code
# For each line that contains at least one field enclosed in double
# quotes, process each such field from left to right until no fields are
# enclosed in double quotes and all remaining commas are field
# separators. The steps are:
#   (1) replace the double quotes enclosing the field being processed
#       with a temporary delimiter to isolate that specific field,
#   (2) remove any commas embedded in the isolated field,
#   (3) reconstruct the line without the temporary delimiters.
# The temporary delimiter must be a single character (for the cut
# command) that cannot appear in the input file. I selected an asterisk
# (*), but other characters can be used. Some characters (such as
# asterisk, colon, hyphen, and equals) work fine, while others (such as
# tab and semicolon) do not.

td=* # temporary delimiter
while read inline
do
  while [ ! -z "$(echo $inline | grep \")" ]
  do
    inline=$(echo $inline | sed "s/\"/$td/" | sed "s/\"/$td/")
    f1=$(echo $inline | cut -d"$td" -f1)
    f2=$(echo $inline | cut -d"$td" -f2)
    f3=$(echo $inline | cut -d"$td" -f3)
    inline=$(echo "$f1$(echo $f2 | sed 's/,//g')$f3")
  done
  echo $inline
done
exit 0

# Test input file fixcsvtest.txt:
$ cat fixcsvtest.txt
4/7/14,subscriptions,199.99,Ask Dave Taylor Monthly
4/10/14,subscriptions,"1,300.99",Linux Journal
4/10/14,subscriptions,"1,300.99","Linux Journal, APR 2014"
4/10/14,subscriptions,19.99,"Linux Journal, annual"
ab,cd,ef,gh
ab,cd,ef,"g,h"
ab,cd,"e,f",gh
ab,cd,"e,f","g,h"
ab,"c,d",ef,gh
ab,"c,d",ef,"g,h"
ab,"c,d","e,f",gh
ab,"c,d","e,f","g,h"
"a,b",cd,ef,gh
"a,b",cd,ef,"g,h"
"a,b",cd,"e,f",gh
"a,b",cd,"e,f","g,h"
"a,b","c,d",ef,gh
"a,b","c,d",ef,"g,h"
"a,b","c,d","e,f",gh
"a,b","c,d","e,f","g,h"
$

# Test Results
$ ./fixcsv < fixcsvtest.txt
4/7/14,subscriptions,199.99,Ask Dave Taylor Monthly
4/10/14,subscriptions,1300.99,Linux Journal
4/10/14,subscriptions,1300.99,Linux Journal APR 2014
4/10/14,subscriptions,19.99,Linux Journal annual
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
ab,cd,ef,gh
$
Dave Taylor replies: Thanks for your note, Jeff, and I do believe you're correct that I didn't test the case where more than a single field of the input data included commas. Bah, pesky debugging! I like your mods, and yet still have a niggling sense that the entire problem can be sidestepped with the perfect regular expression. If I only had a few weeks to create it!
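For readers curious what a regular-expression approach might look like, here is one possible sketch, not Dave's or Jeff's method. It uses a sed loop that repeatedly deletes the comma that follows an odd-numbered (that is, opening) double quote, then strips the quotes. The function name strip_quoted_commas is made up for illustration, and the sketch assumes fields never contain escaped ("") quotes or embedded newlines.

```shell
#!/bin/bash
# Sketch only: strip_quoted_commas is a hypothetical name for illustration.
# Assumes no escaped ("") quotes and no newlines inside quoted fields.
strip_quoted_commas() {
  # The group \([^"]*"[^"]*"\)* skips zero or more complete quoted fields,
  # [^"]*" then reaches the next opening quote, and [^",]* walks to the
  # first comma inside that field. The comma is dropped, and "ta" repeats
  # the substitution until none is left. Finally, all quotes are removed.
  sed -e ':a' \
      -e 's/^\(\([^"]*"[^"]*"\)*[^"]*"[^",]*\),/\1/' \
      -e 'ta' \
      -e 's/"//g'
}

# Example: both the amount and the comment contain commas.
printf '%s\n' '4/10/14,subscriptions,"1,300.99","Linux Journal, APR 2014"' |
  strip_quoted_commas
```

Because the pattern counts quotes from the start of the line, it does not need a temporary delimiter, and it leaves the commas between fields alone. Saved in a script, the function can be fed the same test file: strip_quoted_commas < fixcsvtest.txt.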
I thought you would like to see an unusual place where LJ is being read this month: 49
degrees north, 35 degrees west. That's the middle of the Atlantic Ocean at 20 knots
heading for NYC. The satellite Internet costs are rather steep on board, so I brought a
few issues with me. Must dash as the sun is over the yardarm.
—
Roger Greenwood