The official description of FPAT, from the GAWK manual, is at https://www.gnu.org/software/gawk/manual/html_node/Splitting-By-Content.html.
gawk -vRS='>Cluster' -F '\n' 'NF == 3 { printf "Cluster %s", $0 }'
Plain old awk (FreeBSD) doesn't want to play like that?!
Ohh... a multi-character RS is a gawk extension (POSIX leaves it unspecified): https://www.gnu.org/software/gawk/manual/html_node/gawk-split-records.html
From the GNU awk manual:
> There is an important difference between the two cases of ‘FS = " "’ (a single space) and ‘FS = "[ \t\n]+"’ (a regular expression matching one or more spaces, TABs, or newlines). For both values of FS, fields are separated by runs (multiple adjacent occurrences) of spaces, TABs, and/or newlines. However, when the value of FS is " ", awk first strips leading and trailing whitespace from the record and then decides where the fields are.
So all the other variants you are using simply won't strip leading and trailing whitespace.
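A quick way to see the difference for yourself is to count fields on a padded line. This is only a sketch: the input string is made up, and it uses nothing beyond POSIX awk.

```shell
# Default FS (a single space): leading and trailing whitespace is stripped first
default_nf=$(printf '  a b  \n' | awk '{ print NF }')

# Regex FS: the leading and trailing runs of spaces each produce an empty field
regex_nf=$(printf '  a b  \n' | awk -v FS='[ \t\n]+' '{ print NF }')

echo "default FS: $default_nf fields, regex FS: $regex_nf fields"
```

With the regex separator, $1 and $NF are empty strings, which is usually what trips people up.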
Is this just a general question, with 'cat -n' as an example, or are you really trying to extract line numbers from the cat output?
If you just need to call external commands, you can use the system() function, as in: system("mv \"name1\" \"name2\"")
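Here is a minimal sketch of that system() call run end to end. The scratch directory and the names name1/name2 are placeholders, not anything from your setup.

```shell
# Scratch directory so the rename cannot touch real files (hypothetical setup)
workdir=$(mktemp -d)
touch "$workdir/name1"

# system() hands the string to the shell and returns the command's exit status
status=$(cd "$workdir" && awk 'BEGIN {
    rc = system("mv \"name1\" \"name2\"")
    print rc
}')

listing=$(ls "$workdir")
rm -rf "$workdir"
echo "exit status: $status, directory now contains: $listing"
```

Checking the return value is worth the extra line, because awk itself will carry on silently if the command fails.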
If you need to process (in your awk program) the output of an external program that you call, you can use getline reading from a pipe, but I don't think your example needs the extra layer of complexity.
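For completeness, reading a command's output through a pipe looks roughly like this. The command string is a stand-in for illustration; substitute whatever program you actually need.

```shell
numbered=$(awk 'BEGIN {
    # Hypothetical command; its output is read one line at a time
    cmd = "echo one; echo two"
    while ((cmd | getline line) > 0)
        printf "%d: %s\n", ++n, line
    # Close the pipe so the same command string could be run again later
    close(cmd)
}')
echo "$numbered"
```

The `> 0` test matters: getline returns -1 on error, and a bare `while (cmd | getline line)` would loop forever in that case.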
>I want to say that a lot of people in this field use AWK/grep/sed on a daily basis. Working with files consisting of hundreds of millions of lines is pretty common for us.
Does that mean your colleagues also use awk/sed/grep to parse the data? Just curious, can you say how large the files are in MB/GB? Since you're parsing the data, does that mean you're making charts/graphs?
You may like [gnuplot](http://www.gnuplot.info/) as another command line tool.
Something like this:
https://asciinema.org/a/ixub1bqJWpJGeQLM7weD3nWWx
Using file can achieve this:
https://asciinema.org/a/zYIf7ftK3bRrNtGX9kWatvNh6
I am just wondering whether I can avoid file and do it with just native awk.
I wouldn't recommend most of the tutorials or courses available on the web, whether free or not; they're mostly shit, scratch the surface of the surface, and over-simplify things so that brain-dead readers can understand.
I'd recommend that you first read the book 'Effective awk Programming' (2015) and then, when you have time, give the standard GNU awk manual at https://www.gnu.org/software/gawk/manual/gawk.html a go.
readfile() is a function that can be written in any awk version; the gawk-specific part is that it is included by default with the language and can be imported with -i
https://www.gnu.org/software/gawk/manual/html_node/Readfile-Function.html
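If you want something that does not depend on gawk at all, a plain getline loop does the job. This is my own sketch, not the manual's version: the function name slurp and the temporary file are made up for the example.

```shell
# Throwaway input file for the demonstration
tmpfile=$(mktemp)
printf 'first\nsecond\n' > "$tmpfile"

contents=$(awk -v path="$tmpfile" '
# Read the whole file into one string, one line at a time (POSIX awk only)
function slurp(file,    line, text) {
    text = ""
    while ((getline line < file) > 0)
        text = text line "\n"
    close(file)
    return text
}
BEGIN { printf "%s", slurp(path) }')

rm -f "$tmpfile"
echo "$contents"
```

One trade-off: this always ends the result with a newline, even if the file's last line had none.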
How would you find out about a film, a restaurant, the weather? Heard of Google yet??
Google search "Posix standard". It will tell you that Unix/Linux has a standardised set of functions (lots of systems have awk extensions, but the Posix standard defines a subset that should be portable anywhere).
Now google for posix awk, and go for the Open Group Library document. Shouldn't be hard -- it came up in all of my top 3 results.
Search for "examples". My Opera browser uses the keyboard shortcut Ctrl-F to open a search box (so does Firefox), and says there are 20 matches. Click on the > until you reach the EXAMPLES header. Then scroll (PageDown key) until you reach example 13.
Actually, there is a bug in that example. You will get bonus points if you fix it. The bug is that if the input file is empty, it will crash out on a divide-by-zero.
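For reference once you have had a go yourself, one possible shape of the fix is below. I am assuming the example in question is the one that sums the first column and prints the average; the wrapper name average and the "no input" message are my own choices.

```shell
# Sum and average of column 1, guarded against an empty input (NR == 0)
average() {
    awk '{ s += $1 }
         END {
             if (NR > 0)
                 print "sum is", s, "average is", s / NR
             else
                 print "no input"
         }'
}

printf '2\n4\n' | average
average < /dev/null
```

The only change that matters is the NR > 0 test in the END block; everything else is the usual accumulate-then-report pattern.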
You are not going to learn anything by bootlegging answers from the web, though. You need to grok it. (Another google there, then.) I can explain it for you, but I can't understand it for you.
I would suggest you open this magnificent document in your browser:
https://www.gnu.org/software/gawk/manual/gawk.html
and then work your way through all 19 Posix examples, looking up everything you don't understand.
It is worth it. Awk is one of the best, fastest, and most powerful Unix commands, and a very good first language to learn.
It may be hard to find, but The AWK Programming Language is good, though it dates from the late 1980s:
If you really want a book on awk, Effective awk Programming has been incredibly useful as a handy reference (even though I only have the 3rd edition).