Log Parsing Cheat Sheet
1. GREP
GREP searches any given input files, selecting lines that match one or more patterns.
2. CUT
CUT cuts out selected portions of each line from each file and writes them to the standard output.
3. SED
SED reads the specified files, modifying the input as specified by a list of commands.
4. AWK
AWK scans each input file for lines that match any of a set of patterns.
5. SORT
SORT sorts the lines of the given files.
6. UNIQ
UNIQ reads the specified input file, compares adjacent lines, and writes a copy of each unique line to the output.
Let’s walk through an example.
To count the number of hits from the top 10 IP addresses requesting the path "/api/payments", given an access log in the common log format:
```
216.67.1.91 - leon [01/Jul/2002:12:11:52 +0000] "GET /index.html HTTP/1.1" 200 431
```
We can use a combination of the grep, cut, sort, uniq, and head commands. Here is a sample pipeline:
```bash
grep '/api/payments' access.log | cut -d ' ' -f 1 | sort | uniq -c | sort -rn | head -10
```
Here's what each part of the command does:
- grep '/api/payments' access.log: This filters the lines containing "/api/payments" from the access.log file.
- cut -d ' ' -f 1: This extracts the first field (the IP address) from each line. The -d ' ' option specifies space as the field delimiter.
- sort: This sorts the IP addresses so that identical addresses end up on adjacent lines, which uniq requires.
- uniq -c: This collapses adjacent duplicate lines and prefixes each surviving line with its number of occurrences.
- sort -rn: This sorts those counts numerically (-n) in reverse order (-r), so the highest counts come first.
- head -10: This shows only the first 10 lines of the output, which correspond to the top 10 IP addresses.
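The same pipeline shape works for other fields of the log. As a variation, here is a sketch that counts hits per HTTP status code instead of per IP; the `-f 9` assumes the common log format shown above, where the status code is the 9th space-separated field, so adjust it if your format differs:

```bash
# Top status codes for /api/payments requests; -f 9 assumes the status
# code is the 9th space-separated field, as in the common log format.
grep '/api/payments' access.log | cut -d ' ' -f 9 | sort | uniq -c | sort -rn
```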
---
Log parsing commands are essential for analyzing logs and extracting useful information, especially in system administration and troubleshooting. Below are some of the most commonly used log parsing commands along with examples.
### 1. **grep**
`grep` is used to search for specific patterns in a file or output.
- **Example: Search for errors in a log file**
```bash
grep "ERROR" /var/log/syslog
```
This command will search for the word "ERROR" in the syslog file and return all matching lines.
- **Example: Case-insensitive search**
```bash
grep -i "error" /var/log/syslog
```
This will return lines with "error", "Error", "ERROR", etc.
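Two more everyday flags: `-v` inverts the match and `-n` prefixes each match with its line number. A self-contained sketch against a throwaway sample file (the file name is illustrative):

```bash
# Build a small sample log to demonstrate against
printf 'boot DEBUG ok\ndisk ERROR full\nnet INFO up\n' > sample.log

# -v: print lines that do NOT contain DEBUG
grep -v "DEBUG" sample.log

# -n: print matching lines prefixed with their line number
grep -n "ERROR" sample.log    # → 2:disk ERROR full
```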
### 2. **awk**
`awk` is a powerful text-processing tool that allows manipulation and extraction of data based on patterns and actions.
- **Example: Extract the date, time, and process name from a log**
```bash
awk '{print $1, $2, $3, $5}' /var/log/syslog
```
This prints the first three fields (the date and time in the traditional syslog format) and the fifth field, which is typically the process name; the fourth field is usually the hostname, and the message itself follows.
- **Example: Filter logs by a specific user**
```bash
awk '$5 == "username"' /var/log/auth.log
```
This prints entries whose 5th field is exactly "username"; field positions vary between log formats, so check yours first.
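Because field numbers like `$5` depend entirely on the log format, a quick way to find where a value lives is to have awk print every field with its index. This is a generic sketch shown on a made-up sample line; in practice you could pipe in `head -1` of any real log:

```bash
# Print each whitespace-separated field with its index, so you can see
# which field number holds the value you care about.
echo 'Jan 01 12:00:00 myhost sshd[42]: session opened' |
  awk '{for (i = 1; i <= NF; i++) printf "%d=%s ", i, $i; print ""}'
# → 1=Jan 2=01 3=12:00:00 4=myhost 5=sshd[42]: 6=session 7=opened
```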
### 3. **sed**
`sed` is used for stream editing, such as searching, finding, and replacing text in logs.
- **Example: Replace "ERROR" with "WARNING"**
```bash
sed 's/ERROR/WARNING/g' /var/log/syslog
```
This prints the file with every occurrence of "ERROR" replaced by "WARNING"; the file on disk is not modified, since sed writes the result to standard output.
- **Example: Delete lines containing "DEBUG"**
```bash
sed '/DEBUG/d' /var/log/syslog
```
This deletes all lines containing "DEBUG" from the output.
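Both examples above only write to standard output. GNU sed can edit the file in place with `-i` (note that BSD/macOS sed requires an argument after `-i`, e.g. `-i ''`). A sketch on a scratch file:

```bash
# Create a scratch file, then rewrite it in place (GNU sed syntax)
printf 'disk ERROR full\n' > scratch.log
sed -i 's/ERROR/WARNING/g' scratch.log
cat scratch.log    # → disk WARNING full
```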
### 4. **cut**
`cut` is used to extract specific columns or fields from a log file.
- **Example: Extract the timestamp from logs**
```bash
cut -d ' ' -f 1-3 /var/log/syslog
```
This extracts the first three fields (assumed to be date and time) from each line, where fields are separated by spaces.
### 5. **tail**
`tail` shows the last few lines of a file, which is helpful for viewing recent log entries.
- **Example: Show the last 10 lines of the log**
```bash
tail /var/log/syslog
```
- **Example: Continuously monitor new log entries (real-time)**
```bash
tail -f /var/log/syslog
```
### 6. **head**
`head` is the opposite of `tail`; it shows the first few lines of a file.
- **Example: View the first 10 lines of a log file**
```bash
head /var/log/syslog
```
### 7. **sort**
`sort` arranges the lines of a file or output in ascending or descending order.
- **Example: Sort log entries**
```bash
sort /var/log/syslog
```
Note that this is a plain lexical sort, so textual timestamps (e.g. "Jan" vs. "Feb") will not come out in chronological order.
- **Example: Sort in reverse order**
```bash
sort -r /var/log/syslog
```
### 8. **uniq**
`uniq` filters out repeated adjacent lines, which is useful for finding unique log entries; input is usually sorted first so that duplicates become adjacent.
- **Example: Find unique IP addresses**
```bash
cut -d ' ' -f 7 /var/log/syslog | sort | uniq
```
This extracts the 7th field (assumed to be an IP address), sorts it, and removes duplicates.
### 9. **wc**
`wc` counts lines, words, or characters in a file.
- **Example: Count the number of log entries**
```bash
wc -l /var/log/syslog
```
- **Example: Count words in a log file**
```bash
wc -w /var/log/syslog
```
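`wc` shines at the end of a pipeline, and counting matching lines is the classic case (`grep -c` does the same filter-and-count in one step). A self-contained sketch on sample data:

```bash
printf 'ERROR a\nINFO b\nERROR c\n' > sample.log

# Count lines containing ERROR
grep "ERROR" sample.log | wc -l    # → 2

# Equivalent, using grep's built-in match counter
grep -c "ERROR" sample.log         # → 2
```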
### 10. **less**
`less` is a pager command that allows you to view large log files one screen at a time.
- **Example: View a log file interactively**
```bash
less /var/log/syslog
```
### 11. **find**
`find` is used to search for files based on criteria, which can be helpful when looking for log files across directories.
- **Example: Find log files modified in the last 24 hours**
```bash
find /var/log -name "*.log" -mtime -1
```
### 12. **xargs**
`xargs` is used to build and execute commands based on the output of previous commands.
- **Example: Delete old log files**
```bash
find /var/log -name "*.log" -mtime +30 | xargs rm
```
This finds all .log files last modified more than 30 days ago and deletes them.
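One caveat: file names containing spaces or newlines can split apart in that pipe. The usual fix is find's `-print0` together with `xargs -0`, which passes names NUL-delimited. Sketched here in a scratch directory so nothing real is removed; in practice you would point it at /var/log and keep the `-mtime +30` age filter:

```bash
# Scratch directory with a space in one file name
dir=$(mktemp -d)
touch "$dir/old report.log" "$dir/notes.txt"

# NUL-delimited names survive the space in "old report.log"
find "$dir" -name "*.log" -print0 | xargs -0 rm -f

ls "$dir"    # → notes.txt
```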
### 13. **logger**
`logger` is used to manually add entries to the system log.
- **Example: Log a custom message**
```bash
logger "This is a custom log entry"
```
---
### Use Case Example: Combined Commands for Log Parsing
To find unique IP addresses from recent logs:
```bash
tail -n 1000 /var/log/auth.log | grep "Accepted" | awk '{for (i = 1; i < NF; i++) if ($i == "from") print $(i + 1)}' | sort | uniq
```
This looks at the last 1,000 lines of the auth log (where SSH logins are typically recorded), filters for "Accepted" login messages, prints the field after "from" (the client IP address; on those lines the last field is usually "ssh2", so `$NF` would grab the wrong value), sorts the addresses, and removes duplicates.
---
These commands, either individually or in combination, can be extremely powerful for parsing, analyzing, and managing logs in Unix/Linux environments.