Grep

TL;DR

# base search
grep 'pattern' path/to/search

# recursive search
grep -R 'pattern' path/to/search/recursively
grep -R --exclude-dir 'excluded-dir-name' 'pattern' path/to/search/recursively   # GNU grep >= 2.5.2; the glob is matched against directory base names

# show line numbers
grep -n 'pattern' path/to/search

# parallel execution
# NUL-delimit file names (-print0 with -0) so that names containing spaces or newlines are passed intact
find . -type f -print0 | parallel -0 -j 100% grep 'pattern'
find . -type f -print0 | xargs -0 -n 1 -P $(nproc) grep 'pattern'

Grep variants

egrep to use extended regular expressions in search patterns, same as grep -E
fgrep to treat patterns as fixed strings, same as grep -F
archive-related variants for searching inside compressed files
pdfgrep for searching inside PDF files
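
The difference between the default, extended and fixed-string modes is easiest to see on a small sample. The sketch below is only illustrative: the temporary file and its contents are made up for the demonstration, and the variable names are arbitrary.

```shell
# Hypothetical sample file, created only for this demonstration.
sample="$(mktemp)"
printf '%s\n' 'a.c' 'abc' 'foo|bar' 'foo' > "$sample"

# Basic regex (default): '.' matches any single character.
basic="$(grep 'a.c' "$sample")"

# Fixed string (-F, like fgrep): '.' is a literal dot.
fixed="$(grep -F 'a.c' "$sample")"

# Extended regex (-E, like egrep): '|' means alternation.
extended="$(grep -E 'foo|bar' "$sample")"

# Fixed string again: '|' is a literal pipe character.
fixed_pipe="$(grep -F 'foo|bar' "$sample")"

printf 'basic:\n%s\n\nfixed:\n%s\n\nextended:\n%s\n\nfixed_pipe:\n%s\n' \
  "$basic" "$fixed" "$extended" "$fixed_pipe"

rm -f "$sample"
```

Here the basic pattern matches both a.c and abc, while -F matches only the literal a.c; likewise -E matches every line containing foo or bar, while -F matches only the line containing the literal text foo|bar.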

Archive-related variants

xzgrep (with xzegrep and xzfgrep) for xz-compressed files
zstdgrep for zstd archives
many others

PDFgrep

For simple searches, you might want to use [pdfgrep].

Should you need more advanced grep capabilities that pdfgrep does not provide, convert the file to text and search that instead.
You can do this using pdftotext, as shown in this example ([source][stackoverflow answer about how to search contents of multiple pdf files]):

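# the file name is passed to sh as "$1" instead of being spliced into the script, so quotes or '$' in names cannot break it
find /path -name '*.pdf' -exec sh -c 'pdftotext "$1" - | grep --with-filename --label="$1" --color "your pattern"' _ {} ';'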

Gotchas

Standard builds of grep run in a single thread; use an external executor such as parallel or xargs to grep multiple files in parallel:
```
find . -type f -print0 | parallel -0 -j 100% grep 'pattern'
find . -type f -print0 | xargs -0 -n 1 -P $(nproc) grep 'pattern'
```
Pass file names NUL-delimited (-print0 with -0) so that names containing spaces or newlines are handled correctly.
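
As a rough illustration of the point above, the following sketch runs the xargs variant against a scratch directory containing a file name with a space. The directory, the file names and the batch size of 16 are assumptions made for the example, and -P is an extension available in GNU and BSD xargs rather than a POSIX option. Adding -H is a small deviation from the commands above: it makes grep print the file name even when it receives a single file.

```shell
# Hypothetical scratch directory with an awkward file name.
dir="$(mktemp -d)"
printf 'needle\n' > "$dir/with space.txt"
printf 'hay\n'    > "$dir/plain.txt"
printf 'needle\n' > "$dir/other.txt"

# Number of parallel jobs; fall back to 2 where nproc is unavailable.
njobs="$(nproc 2>/dev/null || echo 2)"

# -print0 / -0 pass names NUL-delimited, so the space survives intact.
# -n 16 hands each grep process up to 16 files instead of one.
# -H forces the "file:" prefix on every matching line.
matches="$(find "$dir" -type f -print0 \
  | xargs -0 -n 16 -P "$njobs" grep -H 'needle' \
  | sort)"

printf '%s\n' "$matches"

rm -rf "$dir"
```

Batching several files per grep invocation usually reduces process start-up overhead compared with -n 1, at the cost of coarser load balancing. Output from parallel jobs can arrive in any order, hence the final sort.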

Sources

Answer on StackOverflow about how to search contents of multiple pdf files
Regular expressions in grep with examples
Parallel grep
