awk is a powerful text-processing language that excels at field-based data extraction and reporting. It reads input line by line, splitting each line into fields, making it ideal for processing CSV files, log files, and tabular data directly from the command line without writing a full script.
awk '{print $1}' filename.txt
$1 is the first field, $2 the second, and $0 is the entire line. Extracting specific columns from command output like ps aux or ls -l is one of the most common daily uses.
awk '{print $1, $3}' filename.txt
Set OFS="," to produce CSV output, or OFS="\t" for tab-separated values.
awk -F',' '{print $2}' data.csv
The -F flag sets the input field separator. Here it splits each line on commas, which handles simple CSV files (fields containing quoted, embedded commas need a real CSV parser). Use -F':' for /etc/passwd, -F'\t' for TSV, or a regular expression as the delimiter.
awk '{print $NF}' filename.txt
NF is the built-in variable holding the count of fields on the current line. Using $NF as a field index gives the last field regardless of how many columns the line has.
awk '/ERROR/ {print}' logfile.txt
awk '$3 > 100 {print $1, $3}' data.txt
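The two commands above both put a pattern in front of the action: a regular expression in the first case, a numeric comparison on a field in the second. As a minimal sketch, assuming a small made-up data set (the names and values below are purely illustrative), they behave like this:

```shell
# Hypothetical sample data: name, status, value (whitespace-separated).
data='alice OK 42
bob ERROR 150
carol OK 230'

# A regex pattern selects matching lines; {print} prints the whole line.
errors=$(printf '%s\n' "$data" | awk '/ERROR/ {print}')
echo "$errors"

# A comparison pattern selects lines whose third field is greater than 100.
large=$(printf '%s\n' "$data" | awk '$3 > 100 {print $1, $3}')
echo "$large"
```

The first command should print only the bob line, and the second should print the name and value for bob and carol.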
awk '{print NR, $0}' filename.txt
NR (Number of Records) is the current line counter starting at 1. Prepending it adds line numbers to output, useful for cross-referencing results with source files or debugging scripts.
awk '{sum += $1} END {print "Total:", sum}' numbers.txt
The END block runs after all lines are processed, printing the accumulated total. This one-liner sums an entire column of numbers in a single pass, which is much faster than importing the data into a spreadsheet.
awk '/ERROR/ {count++} END {print count}' logfile.txt
This can replace grep pattern | wc -l for counting matching lines because it does the work in a single process with one pass through the file.
awk '{sum += $1; count++} END {if (count) print "Avg:", sum/count}' data.txt
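The accumulator one-liners above can be tried on a few inline numbers instead of real files. This is only a sketch: the sample values are invented, and the count+0 and if (count) guards are additions here so that empty or non-matching input prints 0 or nothing rather than a blank line or a division error.

```shell
# Hypothetical sample column of numbers.
nums='10
20
30'

# Sum the first field of every line; END runs once after the last line.
total=$(printf '%s\n' "$nums" | awk '{sum += $1} END {print "Total:", sum}')
echo "$total"

# Average: track the sum and the line count, and skip the division on empty input.
avg=$(printf '%s\n' "$nums" | awk '{sum += $1; count++} END {if (count) print "Avg:", sum/count}')
echo "$avg"

# Count matching lines; count+0 forces a numeric 0 when nothing matches.
matches=$(printf 'ok\nERROR a\nERROR b\n' | awk '/ERROR/ {count++} END {print count+0}')
echo "$matches"
```

With this input the three commands should report a total of 60, an average of 20, and 2 matching lines.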
Use awk '{print $1}' file to extract the first column of any whitespace-separated output.
Use -F',' for CSV or -F':' for /etc/passwd-style files.
Use a pattern, such as /ERROR/ {print}, to filter lines.
Use BEGIN{} for setup and END{} for summary output like totals and averages.

Use awk when you need to work with specific columns or perform calculations on field values. Use sed for simple line-by-line substitutions and deletions. awk is better for structured tabular data while sed excels at regex-based text transformations.
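To make the awk versus sed contrast concrete, here is a small sketch on an invented three-column CSV (item, quantity, unit price; all values are hypothetical). sed rewrites text on each line without any notion of columns, while awk splits on commas, multiplies two fields, and writes CSV back out through OFS.

```shell
# Hypothetical CSV: item, quantity, unit price.
csv='widget,4,2.50
gadget,10,1.25'

# sed: a plain text substitution applied to every line.
renamed=$(printf '%s\n' "$csv" | sed 's/widget/sprocket/')
echo "$renamed"

# awk: split on commas, compute quantity * price, and join output fields with a comma.
totals=$(printf '%s\n' "$csv" | awk -F',' 'BEGIN {OFS=","} {print $1, $2 * $3}')
echo "$totals"
```

The sed command changes only the item name, whereas the awk command produces a new two-column CSV with a computed total per item.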
NR is the current line number (Number of Records). NF is the count of fields on the current line (Number of Fields). Using $NF as a field index accesses the last field on a line regardless of its position.
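A quick way to see NR, NF, and $NF side by side is to print all three for lines of different lengths. The sample text below is made up for illustration.

```shell
# Hypothetical input with 3, 2, and 4 fields per line.
sample='a b c
d e
f g h i'

# For each line: record number, field count, and the last field.
report=$(printf '%s\n' "$sample" | awk '{print NR, NF, $NF}')
echo "$report"
```

Each output line shows the line number, how many fields that line has, and its last field, so the second line should read 2 2 e.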
Yes, awk accepts multiple filenames and processes them sequentially. NR keeps incrementing across files. Use FNR instead if you need the line number to reset at the start of each new input file.
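The difference between NR and FNR shows up as soon as two files are passed. The sketch below creates two throwaway files in a temporary directory (the file names are hypothetical) and prints the standard FILENAME variable next to both counters.

```shell
# Create two small hypothetical input files in a scratch directory.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/first.txt"
printf 'c\nd\n' > "$tmp/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counters=$(cd "$tmp" && awk '{print FILENAME, NR, FNR}' first.txt second.txt)
echo "$counters"

# Clean up the scratch directory.
rm -rf "$tmp"
```

In the output, NR runs from 1 to 4 while FNR goes 1, 2 and then 1, 2 again for the second file.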