> I've already read that grep is limited to a certain line length,
Not the GNU version. The manual specifically says that "it has no limits on input line length other than available memory".
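If you want to spot-check that yourself, here is a minimal sketch. The 1,000,000-byte line length and the word "needle" are arbitrary choices of mine, and one passing run obviously doesn't prove there is no limit, it just shows a very long line isn't a problem.

```shell
# Build a single line of 1,000,000 "a" characters followed by "needle",
# then ask grep how many lines match.
count=$( { head -c 1000000 /dev/zero | tr '\0' 'a'; printf 'needle\n'; } | grep -c 'needle' )
echo "$count"   # prints 1: the one very long line matched
```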
Of course if you don't know something it's not normal. But you did learn something. You learned that <code>\w</code> catches <code>_</code>. You learned that before you posted anything here. If you had googled this question, or read the documentation for popular regex standards (btw, regex implementations sometimes differ slightly on meta-characters and what they mean), you could have easily verified that what you learned was correct.
So I wanted to assure you that not only is it correct but it's also normal.
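A quick way to see it for yourself, as a small sketch (the sample text is made up, and I'm assuming a grep that supports <code>-o</code> and <code>\w</code>, which GNU grep does):

```shell
# -o prints each match on its own line; \w+ matches runs of word characters.
# The underscore stays inside a match, while the space and the hyphen split matches.
words=$(printf 'foo_bar baz-qux\n' | grep -oE '\w+')
echo "$words"
# foo_bar
# baz
# qux
```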
BTW, tutorials are great, but I guess they can be incomplete or unclear. I'd really recommend also using a manual, which, while maybe less ELI5-friendly, will probably be more complete and encyclopedic. GNU grep has a pretty awesome regex manual: http://www.gnu.org/software/grep/manual/html_node/Regular-Expressions.html
Run "ls -ltr" and the ones at the bottom of the list were most recently updated.
But most likely you want to be looking in <code>messages</code> or <code>syslog</code>. You'll likely be wanting to use <code>tail</code> and <code>grep</code>. If you want to pull the log file to some other computer to peruse it, check out <code>scp</code>.
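Roughly, the workflow could look like the sketch below. To keep it runnable anywhere, it uses a scratch directory with two made-up log files instead of your real log directory, and it uses <code>ls -tr</code> (names only) rather than <code>ls -ltr</code> so the newest name is easy to pick out. The file contents, the "error" search term, and the <code>user@server</code> address are all placeholders.

```shell
# Stand-in for the real log directory, with two fake log files.
logdir=$(mktemp -d)
printf 'boot ok\n' > "$logdir/old.log"
printf 'disk check ok\nerror: eth0 link down\nservice started\n' > "$logdir/messages"
touch -t 202001010000 "$logdir/old.log"   # backdate old.log so it sorts as older

# Sorted oldest first, so the most recently updated file is the last line.
newest=$(ls -tr "$logdir" | tail -n 1)
echo "most recently updated: $newest"

# Read only the end of that log and keep the lines mentioning "error".
matches=$(tail -n 50 "$logdir/$newest" | grep -i 'error')
echo "$matches"

# To peruse the log on another machine (placeholder user and host):
# scp user@server:/var/log/messages .
```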
I don't think there is any particular way to do this without it still going crazy slow. I remember reading that the GNU version of grep uses the Boyer-Moore algorithm to avoid looking at every byte, and executes very few instructions for each byte that it does look at. Seeing that there is already a very well made FOSS tool for searching for substrings in files (grep), why not try to use it?
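For a plain substring search, something like this sketch might be a starting point. The file and the search string are invented for the example. <code>-F</code> tells grep to treat the pattern as a fixed string rather than a regex, and setting <code>LC_ALL=C</code> is a common tip for making it compare raw bytes; whether either helps enough for your data is something you'd have to measure.

```shell
# Throwaway sample file standing in for the real data.
sample=$(mktemp)
printf 'alpha\nneedle in haystack\nomega\n' > "$sample"

# -F: fixed-string match, -n: prefix each hit with its line number.
hits=$(LC_ALL=C grep -nF 'needle' "$sample")
echo "$hits"   # prints 2:needle in haystack

rm -f "$sample"
```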
What you are looking for sounds like n-grams; maybe something like this, or something of the sort, could help. It's hard to say without knowing the use cases.
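In case the term is new: an n-gram is just every run of n consecutive items, for example every 3 consecutive characters. A tiny sketch with a made-up word and n = 3, only to show the idea and not a real index:

```shell
# Slide a 3-character window across the input and print each window.
trigrams=$(printf 'banana\n' | awk '{ for (i = 1; i <= length($0) - 2; i++) print substr($0, i, 3) }')
echo "$trigrams"
# ban
# ana
# nan
# ana
```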
I just read some of your other replies and realized that I haven't been very helpful since you aren't programming a solution.