Python/Networking/Security/Virtualization Fundamentals: September 2012

Wednesday, September 26, 2012

Netflow

NetFlow is a network protocol developed by Cisco Systems for collecting IP traffic information. NetFlow has become an industry standard for traffic monitoring and is supported on various platforms.

Routers and switches that support NetFlow can collect IP traffic statistics on all interfaces where NetFlow is enabled, and later export those statistics as NetFlow records toward at least one NetFlow collector, typically a server that does the actual traffic analysis.

Network Flows

A network flow can be defined in many ways. Cisco standard NetFlow version 5 defines a flow as a unidirectional sequence of packets that all share the following 7 values:

Ingress interface (SNMP ifIndex)

Source IP address

Destination IP address

IP protocol

Source port for UDP or TCP, 0 for other protocols

Destination port for UDP or TCP, type and code for ICMP, or 0 for other protocols

IP Type of Service
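As a rough illustration of how the seven key fields above identify a flow, the following Python sketch groups packets by that key and tallies packets and bytes. The `FlowKey` type, the `aggregate` function, and the input format are assumptions made for this example, not part of any NetFlow implementation.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """The seven NetFlow v5 key fields (names are illustrative)."""
    input_if: int   # ingress interface (SNMP ifIndex)
    src_ip: str
    dst_ip: str
    protocol: int   # IP protocol number (6 = TCP, 17 = UDP, 1 = ICMP)
    src_port: int   # 0 for protocols other than UDP or TCP
    dst_port: int   # port, ICMP type/code, or 0
    tos: int        # IP Type of Service


def aggregate(packets):
    """Group (FlowKey, byte_count) pairs into per-flow packet and byte totals."""
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for key, size in packets:
        flows[key]["packets"] += 1
        flows[key]["bytes"] += size
    return dict(flows)


# Flows are unidirectional: the reply direction has a different key.
web = FlowKey(1, "172.16.10.2", "172.16.1.84", 6, 49152, 80, 0)
reply = FlowKey(2, "172.16.1.84", "172.16.10.2", 6, 80, 49152, 0)
flows = aggregate([(web, 40), (web, 1500), (reply, 40)])
```

Because the key is a tuple of all seven values, two packets land in the same flow only if every field matches.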

Note that the egress interface, IP next hop, and BGP next hop are not part of the key, and may not be accurate if the route changes before the expiration of the flow, or if load-balancing is done per-packet.

That definition of flows is also used for IPv6, and a similar definition is used for MPLS and Ethernet flows.

Export of NetFlow records

The router will output a flow record when it determines that the flow is finished. It does this by flow aging: when the router sees new traffic for an existing flow it resets the aging counter. Also, TCP session termination in a TCP flow causes the router to expire the flow. Routers can also be configured to output a flow record at a fixed interval even if the flow is still ongoing.

NetFlow Packet transport protocol

NetFlow records are traditionally exported using User Datagram Protocol (UDP) and collected using a NetFlow collector. The IP address of the NetFlow collector and the destination UDP port must be configured on the sending router. The standard value is UDP port 2055, but other values like 9555 or 9995 are often used.

For efficiency reasons, the router traditionally does not keep track of flow records already exported, so if a NetFlow packet is dropped due to network congestion or packet corruption, all contained records are lost forever. UDP does not inform the router of the loss, so the router cannot send the packets again. This can be a real problem, especially with NetFlow v8 or v9, which can aggregate a lot of packets or flows into a single record. A single UDP packet loss can have a huge impact on the statistics of some flows.

That is why some modern implementations of NetFlow use the Stream Control Transmission Protocol (SCTP) to export packets so as to provide some protection against packet loss and make sure that NetFlow v9 templates are received before any related record is exported. Note that TCP would not be suitable for NetFlow because a strict ordering of packets would cause excessive buffering and delays.

The problem with SCTP is that it requires interaction between each NetFlow collector and each router exporting NetFlow. There may be performance limitations if a router has to deal with many NetFlow collectors, and a NetFlow collector has to deal with lots of routers, especially when some of them are unavailable due to failure or maintenance.

SCTP may not be efficient if NetFlow must be exported toward several independent collectors, some of which may be test servers that can go down at any moment. UDP allows simple replication of NetFlow packets using network taps or L2 or L3 mirroring. Simple stateless equipment can also filter or change the destination address of NetFlow UDP packets if necessary. Since NetFlow export uses network backbone links almost exclusively, packet loss will often be negligible. If it happens, it will mostly be on the link between the network and the NetFlow collectors.

NetFlow Packet header

All NetFlow packets begin with a version-dependent header that contains at least these fields:

Version number (v5, v8, v9, v10)

Sequence number to detect loss and duplication

Timestamps at the moment of export, as system uptime or absolute time.

Number of records (v5 or v8) or list of templates and records (v9)
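To make the header fields above concrete, here is a minimal Python sketch that unpacks a NetFlow v5 header and uses the sequence number to detect lost records. It assumes the commonly documented 24-byte v5 header layout, in which the sequence number counts exported flow records; the function names are hypothetical.

```python
import struct

# Assumed v5 header layout: version, count, sys_uptime (ms), unix_secs,
# unix_nsecs, flow_sequence, engine_type, engine_id, sampling_interval.
V5_HEADER = struct.Struct("!HHIIIIBBH")  # 24 bytes, network byte order


def parse_v5_header(datagram):
    """Return the header fields of a NetFlow v5 export packet as a dict."""
    (version, count, sys_uptime_ms, unix_secs, unix_nsecs,
     flow_sequence, engine_type, engine_id, sampling) = V5_HEADER.unpack_from(datagram)
    if version != 5:
        raise ValueError("not a NetFlow v5 packet (version=%d)" % version)
    return {
        "version": version,
        "count": count,                  # number of records in this packet
        "sys_uptime_ms": sys_uptime_ms,  # export time as system uptime
        "unix_secs": unix_secs,          # export time as absolute time
        "flow_sequence": flow_sequence,  # records exported before this packet
    }


def lost_records(previous, current):
    """Records missing between two consecutive packets from one exporter."""
    expected = (previous["flow_sequence"] + previous["count"]) % 2**32
    return (current["flow_sequence"] - expected) % 2**32


# Two synthetic headers: the second should start at sequence 130, not 160.
first = parse_v5_header(V5_HEADER.pack(5, 30, 1000, 1348617600, 0, 100, 0, 0, 0))
second = parse_v5_header(V5_HEADER.pack(5, 30, 2000, 1348617601, 0, 160, 0, 0, 0))
gap = lost_records(first, second)
```

In this example `gap` is 30, which would correspond to one dropped packet carrying 30 records. A collector can count the loss but cannot recover it, since the exporter keeps no copy of what it sent.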

NetFlow Record

A NetFlow record can contain a wide variety of information about the traffic in a given flow.

NetFlow version 5 (one of the most commonly used versions, followed by version 9) contains the following:

Input interface index used by SNMP (ifIndex in IF-MIB).

Output interface index or zero if the packet is dropped.

Timestamps for the flow start and finish time, in milliseconds since the last boot.

Number of bytes and packets observed in the flow

Layer 3 headers:

Source & destination IP addresses

Source and destination port numbers for TCP, UDP, SCTP

ICMP Type and Code.

IP protocol

Type of Service (ToS) value

For TCP flows, the union of all TCP flags observed over the life of the flow.

Layer 3 Routing information:

IP address of the immediate next-hop (not the BGP nexthop) along the route to the destination

Source & destination IP masks (prefix lengths in the CIDR notation)

For ICMP flows, the Source Port is zero, and the Destination Port number field codes ICMP message Type and Code (port = ICMP-Type * 256 + ICMP-Code).
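The ICMP encoding above is simple arithmetic, which the following short Python sketch spells out. The function names are made up for illustration.

```python
from typing import Tuple


def encode_icmp_dst_port(icmp_type: int, icmp_code: int) -> int:
    """Pack ICMP type and code into the destination port field."""
    return icmp_type * 256 + icmp_code


def decode_icmp_dst_port(dst_port: int) -> Tuple[int, int]:
    """Split the destination port field back into (ICMP type, ICMP code)."""
    return divmod(dst_port, 256)


echo_request = encode_icmp_dst_port(8, 0)      # type 8, code 0
port_unreachable = decode_icmp_dst_port(771)   # 771 = 3 * 256 + 3
```

Here `echo_request` is 2048 and `port_unreachable` is `(3, 3)`, that is, destination unreachable with code "port unreachable".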

The source and destination Autonomous System (AS) number fields can report the destination AS (last AS of AS-Path) or the immediate neighbor AS (first AS of AS-Path), depending on the router configuration. But the AS number will be zero if the feature is not supported, the route is unknown or not announced by BGP, or the AS is the local AS. There is no explicit way to distinguish between these cases.
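The version 5 fields described above can be read from an export packet with a fixed-size structure. The sketch below assumes the commonly documented 48-byte v5 record layout; the field names and the `parse_v5_record` helper are illustrative rather than taken from any particular collector.

```python
import ipaddress
import struct

# Assumed v5 record layout: srcaddr, dstaddr, nexthop, input, output, dPkts,
# dOctets, first, last, srcport, dstport, pad, tcp_flags, prot, tos,
# src_as, dst_as, src_mask, dst_mask, pad (48 bytes in total).
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHxBBBHHBBxx")

TCP_FLAG_NAMES = [(0x01, "FIN"), (0x02, "SYN"), (0x04, "RST"),
                  (0x08, "PSH"), (0x10, "ACK"), (0x20, "URG")]


def parse_v5_record(data, offset=0):
    """Decode one v5 flow record starting at `offset` into a dict."""
    (src, dst, nexthop, input_if, output_if, packets, octets, first, last,
     src_port, dst_port, tcp_flags, protocol, tos, src_as, dst_as,
     src_mask, dst_mask) = V5_RECORD.unpack_from(data, offset)
    return {
        "src_ip": str(ipaddress.IPv4Address(src)),
        "dst_ip": str(ipaddress.IPv4Address(dst)),
        "next_hop": str(ipaddress.IPv4Address(nexthop)),
        "input_if": input_if, "output_if": output_if,
        "packets": packets, "bytes": octets,
        "first_ms": first, "last_ms": last,   # milliseconds since last boot
        "src_port": src_port, "dst_port": dst_port,
        "protocol": protocol, "tos": tos,
        # Union of all TCP flags seen over the life of the flow.
        "tcp_flags": [name for bit, name in TCP_FLAG_NAMES if tcp_flags & bit],
        "src_as": src_as, "dst_as": dst_as,   # 0 can have several meanings
        "src_mask": src_mask, "dst_mask": dst_mask,
    }


# A synthetic record for a short HTTP flow.
raw = V5_RECORD.pack(
    ipaddress.IPv4Address("172.16.10.2").packed,
    ipaddress.IPv4Address("172.16.1.84").packed,
    ipaddress.IPv4Address("172.16.1.1").packed,
    1, 2, 10, 5734, 1000, 4000, 49152, 80, 0x1B, 6, 0, 0, 0, 24, 24)
record = parse_v5_record(raw)
```

Records follow the header back to back, so a collector would call the parser once per record, advancing `offset` by 48 bytes each time.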

NetFlow version 9 can include all of these fields and can optionally include additional information such as Multiprotocol Label Switching (MPLS) labels and IPv6 addresses and ports.

By analyzing flow data, a picture of traffic flow and traffic volume in a network can be built. The NetFlow record format has evolved over time, hence the inclusion of version numbers. Cisco maintains details of the different version numbers and the layout of the packets for each version.

NetFlow interfaces

NetFlow is usually enabled on a per-interface basis to limit load on the router components involved in NetFlow, or to limit the amount of NetFlow records exported.

NetFlow usually captures all packets received by an ingress IP interface, but some NetFlow implementations use IP filters to decide if a packet can be observed by NetFlow.

Some NetFlow implementations also allow the observation of packets on the egress IP interface, but this must be used with care: all flows from any ingress interface with NetFlow enabled to any interface with NetFlow enabled could be counted twice.

Sampled NetFlow

Standard NetFlow was designed to process all IP packets on an interface. But in some environments, e.g. on Internet backbones, that was too costly, due to the extra processing required for each packet and the large number of simultaneous flows. So Cisco introduced sampled NetFlow on the Cisco 12000, and that is now used in all high-end routers that implement NetFlow.

Only one packet out of n is processed, where n, the sampling rate, is determined by the router configuration.

The exact selection process depends on the implementation:

One packet every n packets, in Deterministic NetFlow, as used on Cisco's 12000.

One packet randomly selected in an interval of n packets, in Random Sampled NetFlow, used on modern Cisco routers.

Some implementations have more complex methods to sample packets, like per-flow sampling on Cisco Catalysts.

The sampling rate is often the same for all interfaces, but can be adjusted per interface for some routers. When Sampled NetFlow is used, the NetFlow records must be adjusted for the effect of sampling - traffic volumes, in particular, are now an estimate rather than the actual measured flow volume.

The sampling rate is indicated in a header field of NetFlow version 5 (same sampling rate for all interfaces) or in option records of NetFlow version 9 (sampling rate per interface).
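The adjustment for sampling can be sketched in Python as follows. The sketch assumes the commonly documented v5 encoding, where the top two bits of the 16-bit header field hold the sampling mode and the low 14 bits hold n; the function names are hypothetical, and the scaled values remain estimates.

```python
def decode_v5_sampling(field):
    """Split the 16-bit v5 sampling field into (mode, interval n)."""
    mode = field >> 14         # top two bits: sampling mode
    interval = field & 0x3FFF  # low 14 bits: one packet out of n
    return mode, interval


def estimate_totals(sampled_packets, sampled_bytes, n):
    """Scale sampled counters by n to estimate the real traffic volume."""
    n = max(n, 1)  # 0 is treated as "no sampling" in this sketch
    return sampled_packets * n, sampled_bytes * n


mode, n = decode_v5_sampling((1 << 14) | 100)   # mode 1, 1-in-100 sampling
packets, octets = estimate_totals(12, 9000, n)
```

With 1-in-100 sampling, 12 observed packets and 9000 observed bytes scale to an estimated 1200 packets and 900000 bytes. Short flows may not be sampled at all, so per-flow estimates are less reliable than totals over many flows.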

NetFlow equivalents

Many vendors other than Cisco provide an equivalent technology on their routers and switches, but some use a different name for the technology, probably because NetFlow is thought to be a Cisco trademark (even though, as of March 2012, it is not listed in Cisco Trademarks[1]):

Jflow or cflowd for Juniper Networks

NetStream for 3Com/HP

NetStream for Huawei Technologies

Cflowd for Alcatel-Lucent

Rflow for Ericsson

AppFlow for Citrix

sFlow for Allied Telesis

========

Cisco IOS NetFlow efficiently provides a key set of services for IP applications, including network traffic accounting, usage-based network billing, network planning, security, Denial of Service monitoring capabilities, and network monitoring. NetFlow provides valuable information about network users and applications, peak usage times, and traffic routing.

NetFlow is embedded instrumentation within Cisco IOS Software that characterizes network operation. Visibility into the network is indispensable for IT professionals. In response to new requirements and pressures, network operators are finding it critical to understand how the network is behaving, including:

• Application and network usage

• Network productivity and utilization of network resources

• The impact of changes to the network

• Network anomalies and security vulnerabilities

• Long term compliance issues

Cisco IOS NetFlow fulfills those needs, creating an environment where administrators have the tools to understand who, what, when, where, and how network traffic is flowing. When the network behavior is understood, business processes improve and an audit trail of how the network is utilized becomes available. This increased awareness reduces the network's vulnerability to outages and allows efficient operation of the network. Improvements in network operation lower costs and drive higher business revenues through better utilization of the network infrastructure.

• NetFlow gives Network Managers a Detailed View of Application Flows on the Network http://www.cisco.com/en/US/prod/collateral/iosswrel/ps6537/ps6555/ps6601/prod_case_study0900aecd80311fc2.pdf

This white paper illustrates the importance of NetFlow and demonstrates how NetFlow can be used by Enterprises, Small and Medium-sized Businesses (SMBs), and Channel Partners to meet critical network challenges. It is a basic overview of how NetFlow works and produces data and reporting solutions.

Increasing Importance of Network Awareness

Traditional SNMP Performance Monitoring

Traditionally customers relied almost exclusively on Simple Network Management Protocol (SNMP) to monitor bandwidth. Although SNMP facilitates capacity planning, it does little to characterize traffic applications and patterns, essential for understanding how well the network supports the business. A more granular understanding of how bandwidth is being used is extremely important in IP networks today. Packet and byte interface counters are useful but understanding which IP addresses are the source and destination of traffic and which applications are generating the traffic is invaluable.

NetFlow Based Network Awareness

The ability to characterize IP traffic and understand how and where it flows is critical for network availability, performance and troubleshooting. Monitoring IP traffic flows facilitates more accurate capacity planning and ensures that resources are used appropriately in support of organizational goals. It helps IT determine where to apply Quality of Service (QoS), optimize resource usage and it plays a vital role in network security to detect Denial-of-Service (DoS) attacks, network-propagated worms, and other undesirable network events.

NetFlow facilitates solutions to many common problems encountered by IT professionals.

• Analyze new applications and their network impact

Identify new application network loads such as VoIP or remote site additions.

• Reduction in peak WAN traffic

Use NetFlow statistics to measure WAN traffic improvement from application-policy changes; understand who is utilizing the network and the network top talkers.

• Troubleshooting and understanding network pain points

Diagnose slow network performance, bandwidth hogs and bandwidth utilization quickly with command line interface or reporting tools.

• Detection of unauthorized WAN traffic

Avoid costly upgrades by identifying the applications causing congestion.

• Security and anomaly detection

NetFlow can be used for anomaly detection and worm diagnosis, along with applications such as Cisco CS-Mars.

• Validation of QoS parameters

Confirm that appropriate bandwidth has been allocated to each Class of Service (CoS) and that no CoS is over- or under-subscribed.

How does NetFlow give you network information?

What is an IP Flow?

Each packet that is forwarded within a router or switch is examined for a set of IP packet attributes. These attributes are the IP packet identity or fingerprint of the packet and determine if the packet is unique or similar to other packets.

Traditionally, an IP Flow is based on a set of five to seven IP packet attributes.

IP Packet attributes used by NetFlow:

• IP source address

• IP destination address

• Source port

• Destination port

• Layer 3 protocol type

• Class of Service

• Router or switch interface

All packets with the same source/destination IP address, source/destination ports, protocol, interface, and class of service are grouped into a flow, and then packets and bytes are tallied. This methodology of fingerprinting or determining a flow is scalable because a large amount of network information is condensed into a database of NetFlow information called the NetFlow cache.

Figure 1. Creating a flow in the NetFlow cache

This flow information is extremely useful for understanding network behavior:

• Source address allows the understanding of who is originating the traffic

• Destination address tells who is receiving the traffic

• Ports characterize the application utilizing the traffic

• Class of service examines the priority of the traffic

• The device interface tells how traffic is being utilized by the network device

• Tallied packets and bytes show the amount of traffic

Additional information added to a flow includes:

• Flow timestamps to understand the life of a flow; timestamps are useful for calculating packets and bytes per second

• Next hop IP addresses including BGP routing Autonomous Systems (AS)

• Subnet mask for the source and destination addresses to calculate prefixes

• TCP flags to examine TCP handshakes
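As a small worked example of the timestamp use mentioned above, the following Python sketch derives packets and bytes per second from a flow's start and end timestamps. It assumes the timestamps are 32-bit millisecond uptime values, as in NetFlow version 5; the function name is made up for this illustration.

```python
def flow_rates(first_ms, last_ms, packets, octets):
    """Return (packets per second, bytes per second), or None if too short."""
    # Uptime timestamps are assumed to be 32-bit, so allow for wraparound.
    duration_ms = (last_ms - first_ms) % 2**32
    if duration_ms == 0:
        return None  # a single-packet flow has no measurable duration
    seconds = duration_ms / 1000.0
    return packets / seconds, octets / seconds


# A flow of 200 packets and 120000 bytes that lasted four seconds.
rates = flow_rates(10000, 14000, 200, 120000)
```

For this flow the result is 50.0 packets per second and 30000.0 bytes per second.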

How to Access the Data Produced by NetFlow?

There are two primary methods to access NetFlow data: the Command Line Interface (CLI) with show commands or utilizing an application reporting tool. If you are interested in an immediate view of what is happening in your network, the CLI can be used. NetFlow CLI is very useful for troubleshooting.

The other choice is to export NetFlow to a reporting server, called the "NetFlow collector". The NetFlow collector has the job of assembling and understanding the exported flows and combining or aggregating them to produce the valuable reports used for traffic and security analysis. NetFlow export, unlike SNMP polling, pushes information periodically to the NetFlow reporting collector. In general, the NetFlow cache is constantly filling with flows, and software in the router or switch searches the cache for flows that have terminated or expired; these flows are exported to the NetFlow collector server. Flows are terminated when the network communication has ended (for example, when a packet contains the TCP FIN flag). The following steps are used to implement NetFlow data reporting:

• NetFlow is configured to capture flows to the NetFlow cache

• NetFlow export is configured to send flows to the collector

• The NetFlow cache is searched for flows that have terminated and these are exported to the NetFlow collector server

• Approximately 30 to 50 flows are bundled together and typically transported in UDP format to the NetFlow collector server

• The NetFlow collector software creates real-time or historical reports from the data
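The receiving end of the steps above can be sketched as a very small UDP listener in Python. This is only an outline of the first thing a collector does, under the assumption that every export packet starts with a 16-bit version followed by a 16-bit count; the function names are hypothetical, and the default port 2055 is simply the conventional value mentioned earlier.

```python
import socket
import struct


def open_collector(host="0.0.0.0", port=2055):
    """Bind a UDP socket on which exporters can deliver NetFlow packets."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((host, port))
    return sock


def receive_export(sock, bufsize=65535):
    """Wait for one export packet and return its sender, version and count."""
    data, (exporter_ip, _exporter_port) = sock.recvfrom(bufsize)
    # Assumption: version and count are the first two 16-bit header fields.
    version, count = struct.unpack_from("!HH", data)
    return exporter_ip, version, count, data
```

A real collector would call `receive_export` in a loop, decode the records that follow the header, and store them for reporting. Because the transport is UDP, nothing here acknowledges the packet to the router.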

How Does the Router or Switch Determine Which Flows to Export to the NetFlow Collector Server?

A flow is ready for export when it is inactive for a certain time (i.e., no new packets are received for the flow), or if the flow is long-lived (active) and lasts longer than the active timer (e.g., a long FTP download). Also, the flow is ready for export when a TCP flag indicates the flow is terminated (i.e., the FIN or RST flag). Timers determine whether a flow is inactive or long-lived; the default inactive flow timer is 15 seconds and the default active flow timer is 30 minutes. All the timers for export are configurable, but the defaults are used in most cases except on the Cisco Catalyst 6500 Series Switch platform. The collector can combine flows and aggregate traffic. For example, an FTP download that lasts longer than the active timer may be broken into multiple flows, and the collector can combine these flows to show the total FTP traffic to a server at a specific time of day.
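The export decision described above can be written as a small Python function. This is a simplified model using the default timers quoted in the text, not the actual router logic; the function name and the returned labels are invented for the example.

```python
INACTIVE_TIMEOUT = 15     # seconds without new packets (default)
ACTIVE_TIMEOUT = 30 * 60  # seconds a long-lived flow may stay cached (default)
TCP_FIN, TCP_RST = 0x01, 0x04


def export_reason(first_seen, last_seen, now, protocol=0, tcp_flags=0):
    """Return why a cached flow should be exported, or None to keep it."""
    if protocol == 6 and tcp_flags & (TCP_FIN | TCP_RST):
        return "tcp-end"   # the TCP session was terminated
    if now - last_seen >= INACTIVE_TIMEOUT:
        return "inactive"  # no new packets for the inactive timer
    if now - first_seen >= ACTIVE_TIMEOUT:
        return "active"    # long-lived flow, e.g. a long FTP download
    return None


# All times are in seconds on an arbitrary clock.
still_cached = export_reason(first_seen=0, last_seen=100, now=110)
idle = export_reason(first_seen=0, last_seen=100, now=115)
long_lived = export_reason(first_seen=0, last_seen=1795, now=1800)
closed = export_reason(0, 1, 2, protocol=6, tcp_flags=0x11)  # FIN + ACK
```

In this model a long download is exported every 30 minutes as separate records, which is why the collector has to add them back together.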

What is the Format of the Export Data?

There are various formats for the export packet and these are commonly called the export version. The export versions are well documented formats including version 5, 7, and 9. The most common format used is NetFlow export version 5 but version 9 is the latest format and has some advantages for key technologies such as security, traffic analysis and multicast. To understand more about export versions and to read a detailed technical discussion about NetFlow, please see the NetFlow Services and Solutions Guide: http://www.cisco.com/en/US/products/sw/netmgtsw/ps1964/products_implementation_design_guide09186a00800d6a11.html

Figure 2 below is an example of the data available in a NetFlow cache.

Figure 2. Example NetFlow Cache

Where Can NetFlow be Implemented in the Network?

NetFlow is typically used on a central site because all traffic from the remote sites is characterized and is available within NetFlow. The location where NetFlow is deployed may depend on the location of the reporting solution and the topology of the network. If the reporting collection server is centrally located, then implementing NetFlow close to the reporting collector server is optimal. NetFlow can also be enabled at remote branch locations with the understanding that the export data will utilize bandwidth. About 1-5% of the switched traffic is used for export to the collection server.

Figure 3. NetFlow export to a collector

Which Applications Report on NetFlow Data?

There are a large number of NetFlow collectors including Cisco, freeware and third party commercial vendors' products that report and utilize NetFlow data. It is important to understand various factors when picking a partner for NetFlow reporting.

• What will be the main uses for NetFlow? Security, capacity planning and traffic analysis including application and user monitoring?

• Is real-time reporting or historical reporting more important?

• Which operating system is preferred for the server?

• Is this a large or small implementation of NetFlow and is scalability a concern?

• How much are you willing to pay for the product?

• Are there any current performance management products used in your organization and can these be extended to support NetFlow?

Once the reporting application is chosen, the sizing of the server and number of servers are determined by talking with the vendor for the product. Some reporting systems offer a two-tier architecture, where collectors are placed near key sites in the network and they aggregate and forward the data to a main reporting server. Other smaller deployments may have a single server for reporting and collection.

Figure 4. Example of traffic analysis reporting utilizing NetFlow data

Summary

NetFlow is an important technology available in your Cisco device to help you with visibility into how your network assets are being used and the network behavior. NetFlow will help reduce costs by giving you an audit trail, reduce troubleshooting time and facilitate reports to understand network utilization. It will help in the implementation of new IP applications and detect security vulnerabilities. NetFlow will let you understand who is using the network, the destination of traffic, when the network is utilized and the type of applications consuming bandwidth.

For more information on NetFlow visit http://www.cisco.com/go/netflow.

For detailed technical IOS documentation on NetFlow, go to: http://www.cisco.com/en/US/products/ps6601/prod_white_papers_list.html

Software Platform Configuration

The following is an example of a basic router configuration for NetFlow. NetFlow basic functionality is very easy to configure. NetFlow is configured on a per interface basis. When NetFlow is configured on the interface, IP packet flow information will be captured into the NetFlow cache. Also, the NetFlow data can be configured to export the NetFlow data to a collection server if a server is deployed.

1. Configuring the interface to capture flows into the NetFlow cache. CEF followed by NetFlow flow capture is configured on the interface

Router(config)# ip cef

Router(config)# interface ethernet 1/0

Router(config-if)# ip flow ingress

Router(config-if)# ip route-cache flow

Note: Either the ip flow ingress or the ip route-cache flow command can be used, depending on the Cisco IOS Software version. The ip flow ingress command is available in Cisco IOS Software Release 12.2(15)T or above.

2. This step is required if exporting the NetFlow cache to a reporting server. The version or format of the NetFlow export packet is chosen first, followed by the destination IP address of the export server. In this example, 9997 is the UDP port on which the server receives the UDP export from the Cisco device.

Router(config)# ip flow-export version 9

Router(config)# ip flow-export destination 172.22.23.7 9997

More Information on NetFlow Configuration is available at: http://www.cisco.com/en/US/products/ps6601/prod_white_papers_list.html

Cisco Catalyst 6500 Series Switch Platform NetFlow Configuration

The following is an example of NetFlow on a Cisco Catalyst 6500 Series Switch. The Cisco Catalyst 6500 Series Switch has two aspects of NetFlow configuration: configuration of hardware based NetFlow and of software NetFlow. Almost all flows on the Cisco Catalyst 6500 Series Switch are hardware switched, and the MLS commands are used to characterize NetFlow in hardware. The MSFC (software based NetFlow) will characterize software based flows for packets that are punted up to the MSFC. Figure 6 shows the concept of two paths for NetFlow packets, the hardware and software paths, and the configuration for each path. Normally on the Cisco Catalyst 6500 Series Switch both hardware and software based NetFlow are configured.

Figure 6. NetFlow flow characterization on Cisco Catalyst 6500 Series Switch

The hardware switched flows use the MLS commands to configure NetFlow. Remember for hardware based flows NetFlow is enabled on all interfaces when configured.

mls aging normal 32 (Set aging of inactive flows to 32 seconds)

mls flow ip interface-full (Optionally configure a flow mask)

mls nde sender version 5 (Specify the version for export from the PFC)

mls nde interface (send interface information with the export, command available by default with Supervisor720/Supervisor 32)

The following is the configuration for NetFlow on the MSFC for software based flows. This configuration is equivalent to what is shown in Appendix A. The user configures NetFlow per interface to activate flow characterization and also configures an export destination for the hardware and software switched flows.

interface POS9/14

ip address 42.50.31.1 255.255.255.252

ip route-cache flow (also ip flow ingress can be used)

ip flow-export version 5 (The export version is setup for the software flows exported from the MSFC)

ip flow-export destination 10.1.1.209 9999 (The destination for hardware and software flows is specified).

More Information on the Cisco Catalyst 6500 Series Switch NetFlow Configuration can be viewed at: http://www.cisco.com/en/US/products/ps6601/prod_white_papers_list.html#anchor7

Example Show Commands for NetFlow Data

The following is an example of how to visualize the NetFlow data using the CLI. There are three methods to visualize the data, depending on the version of Cisco IOS Software. The traditional show command for NetFlow is "show ip cache flow". Also available are two forms of top talkers commands: one uses a static configuration to view top talkers in the network, and the other, called dynamic top talkers, allows real-time sorting and aggregation of NetFlow data. Also shown is a show MLS command to view the hardware cache on the Cisco Catalyst 6500 Series Switch.

The following is the original NetFlow show command used for many years in Cisco IOS Software. Information provided includes packet size distribution; basic statistics about number of flows and export timer setting, a view of the protocol distribution statistics and the NetFlow cache.

R3#show ip cache flow

IP packet size distribution (469 total packets):

1-32 64 96 128 160 192 224 256 288 320 352 384 416 448 480

.000 .968 .000 .031 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000

512 544 576 1024 1536 2048 2560 3072 3584 4096 4608

.000 .000 .000 .000 .000 .000 .000 .000 .000 .000 .000

IP Flow Switching Cache, 278544 bytes

7 active, 4089 inactive, 261 added

1278 ager polls, 0 flow alloc failures

Active flows timeout in 30 minutes

Inactive flows timeout in 15 seconds

IP Sub Flow Cache, 25736 bytes

1 active, 1023 inactive, 38 added, 38 added to flow

0 alloc failures, 0 force free

1 chunk, 1 chunk added

last clearing of statistics never

Protocol Total Flows Packets Bytes Packets Active(Sec) Idle(Sec)

-------- Flows /Sec /Flow /Pkt /Sec /Flow /Flow

TCP-WWW 71 0.0 1 40 0.1 1.3 1.2

TCP-BGP 35 0.0 1 40 0.0 1.3 1.2

TCP-other 108 0.1 1 40 0.1 1.3 1.2

UDP-other 37 0.0 1 52 0.0 0.0 15.4

ICMP 3 0.0 5 100 0.0 0.0 15.3

Total: 254 0.2 1 42 0.4 1.1 3.5

(NetFlow cache below)

SrcIf SrcIPaddress DstIf DstIPaddress Pr SrcP DstP Pkts

Et1/0 172.16.7.2 Null 224.0.0.9 11 0208 0208 1

Et1/0 172.16.10.2 Et0/0 172.16.1.84 06 0087 0087 1

Et1/0 172.16.10.2 Et0/0 172.16.1.84 06 0050 0050 1

Et1/0 172.16.10.2 Et0/0 172.16.1.85 06 0089 0089 1

Et1/0 172.16.10.2 Et0/0 172.16.1.85 06 0050 0050 1

Et1/0 172.16.10.2 Et0/0 172.16.1.86 06 00B3 00B3 1

Et1/0 172.16.10.2 Et0/0 172.16.1.86 06 0185 0185 2

Table 4.

Field Description

bytes

Number of bytes of memory used by the NetFlow cache.

active

Number of active flows in the NetFlow cache at the time this command was entered.

inactive

Number of flow buffers that are allocated in the NetFlow cache, but were not currently assigned to a specific flow at the time this command was entered.

added

Number of flows created since the start of the summary period.

ager polls

Number of times the NetFlow code looked at the cache to cause entries to expire (used by Cisco for diagnostics only).

flow alloc failures

Number of times the NetFlow code tried to allocate a flow but could not.

exporting flows

IP address and User Datagram Protocol (UDP) port number of the workstation to which flows are exported.

flows exported in udp datagrams

Total number of flows exported and the total number of UDP datagrams used to export the flows to the workstation.

failed

Number of flows that could not be exported by the router because of output interface limitations.

last clearing of statistics

Standard time output (hh:mm:ss) since the clear ip flow stats privileged EXEC command was executed. This time output changes to hours and days after the time exceeds 24 hours.

Protocol

IP protocol and the well-known port number. (Refer to http://www.iana.org, Protocol Assignment Number Services, for the latest RFC values.)

Note: Only a small subset of all protocols is displayed.

Total Flows

Number of flows in the cache for this protocol since the last time the statistics were cleared.

Flows/Sec

Average number of flows for this protocol per second; equal to the total flows divided by the number of seconds for this summary period.

Packets/Flow

Average number of packets for the flows for this protocol; equal to the total packets for this protocol divided by the number of flows for this protocol for this summary period.

Bytes/Pkt

Average number of bytes for the packets for this protocol; equal to the total bytes for this protocol divided by the total number of packets for this protocol for this summary period.

Packets/Sec

Average number of packets for this protocol per second; equal to the total packets for this protocol divided by the total number of seconds for this summary period.

Active(Sec)/Flow

Number of seconds from the first packet to the last packet of an expired flow divided by the number of total flows for this protocol for this summary period.

Idle(Sec)/Flow

Number of seconds observed from the last packet in each nonexpired flow for this protocol until the time at which the show ip cache verbose flow command was entered divided by the total number of flows for this protocol for this summary period.

show ip cache flow Field Descriptions in NetFlow Record Display

SrcIf

Interface on which the packet was received.

Port Msk AS

Source Border Gateway Protocol (BGP) autonomous system. This is always set to 0 in MPLS flows.

SrcIPaddress

IP address of the device that transmitted the packet.

DstIf

Interface from which the packet was transmitted.

Note: If an asterisk (*) immediately follows the DstIf field, the flow being shown is an egress flow.

Port Msk AS

Destination BGP autonomous system. This is always set to 0 in MPLS flows.

DstIPaddress

IP address of the destination device.

NextHop

Specifies the BGP next-hop address. This is always set to 0 in MPLS flows.

Pr

IP protocol well-known port number as described in RFC 1340, displayed in hexadecimal format.

B/Pk

Average number of bytes observed for the packets seen for this protocol (total bytes for this protocol or the total number of flows for this protocol for this summary period).

Flgs

TCP flags (result of bitwise OR of TCP flags from all packets in the flow).

Active

Number of active flows in the NetFlow cache at the time this command was entered.

Pkts

Number of packets switched through this flow.

More information on show ip cache flow is available at: http://www.cisco.com/en/US/docs/ios/12_2/switch/command/reference/xrfscmd5.html#wp1066187
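Since the protocol and port columns of the cache display are hexadecimal, it can help to convert them when reading the output. The following Python sketch parses one cache line of the eight-column form shown earlier; the function name and the returned dictionary are assumptions made for this example.

```python
def parse_cache_line(line):
    """Convert one 'show ip cache flow' cache entry into decimal values."""
    src_if, src_ip, dst_if, dst_ip, proto, src_port, dst_port, pkts = line.split()
    return {
        "src_if": src_if, "src_ip": src_ip,
        "dst_if": dst_if, "dst_ip": dst_ip,
        "protocol": int(proto, 16),     # 06 = TCP, 11 = UDP (decimal 17)
        "src_port": int(src_port, 16),  # 0050 = port 80
        "dst_port": int(dst_port, 16),
        "packets": int(pkts),           # the packet count is decimal
    }


web = parse_cache_line("Et1/0 172.16.10.2 Et0/0 172.16.1.84 06 0050 0050 1")
rip = parse_cache_line("Et1/0 172.16.7.2 Null 224.0.0.9 11 0208 0208 1")
```

The first entry decodes to TCP port 80, and the second to UDP (protocol 17) port 520.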

The following command will show hardware based flow specifically on the Cisco Catalyst 6500 Series Switch platform. Also, the above command "show ip cache flow" can be used to show both hardware and software flows on the Cisco Catalyst 6500 Series Switch but this depends on the supervisor and release of Cisco IOS Software being used.

C6500#show mls netflow ip

Displaying Netflow entries in Supervisor Earl

DstIP SrcIP Prot:Src Port:DstPort Src i/f :AdjPtr Pkts Bytes Age LastSeen Attributes

---------------------------------------------------

10.102.130.213 10.214.39.79 tcp:46528 :www :0x0 7 3766 17 15:47:37 L3 - Dynamic

10.230.215.148 10.155.22.221 tcp:51813 :45912 :0x0 25 21329 47 15:47:39 L3 - Dynamic

10.97.36.200 10.17.64.177 tcp:65211 :www :0x0 9 7664 17 15:47:38 L3 - Dynamic

10.90.33.185 10.46.13.211 tcp:27077 :60425 :0x0 10 5734 17 15:47:38 L3 - Dynamic

<...>

The following describes the NetFlow Top Talkers command showing the largest packet and byte consumers on the network. Network Top Talkers does require some configuration. The configuration is shown followed by the show command. This command is available in Release 12.3(11)T and Release 12.2(25)S and above Cisco IOS Software releases.

Router(config)#ip flow-top-talkers

Router(config-flow-top-talkers)#top 10

The following shows the top ten talkers in the network, sorted by packets:

R3#show ip flow top-talkers

SrcIf SrcIPaddress DstIf DstIPaddress Pr SrcP DstP Pkts

Et1/0 172.16.10.2 Et0/0 172.16.1.84 06 0087 0087 2100

Et1/0 172.16.10.2 Et0/0 172.16.1.85 06 0089 0089 1892

Et1/0 172.16.10.2 Et0/0 172.16.1.86 06 0185 0185 1762

Et1/0 172.16.10.2 Et0/0 172.16.1.86 06 00B3 00B3 2

Et1/0 172.16.10.2 Et0/0 172.16.1.84 06 0050 0050 1

Et1/0 172.16.10.2 Et0/0 172.16.1.85 06 0050 0050 1

7 of 10 top talkers shown. 7 flows processed.

More information on NetFlow MIB and Top Talkers can be found at: http://www.cisco.com/en/US/products/sw/iosswrel/ps1838/products_feature_guide09186a0080259533.html
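The Pr, SrcP, and DstP columns in the top-talkers output above are hexadecimal (for example, 00B3 is port 179). A minimal Python sketch for converting one row follows; the function name parse_talker is hypothetical, and it assumes the eight whitespace-separated columns shown in the header.

```python
def parse_talker(line):
    """Split one top-talkers row and convert the hex columns to integers."""
    src_if, src_ip, dst_if, dst_ip, proto, sport, dport, pkts = line.split()
    return {
        "src_if": src_if, "src_ip": src_ip,
        "dst_if": dst_if, "dst_ip": dst_ip,
        "protocol": int(proto, 16),   # 06 -> 6 (TCP)
        "src_port": int(sport, 16),   # 0050 -> 80
        "dst_port": int(dport, 16),
        "packets": int(pkts),         # the packet count is decimal
    }

row = parse_talker("Et1/0 172.16.10.2 Et0/0 172.16.1.84 06 0050 0050 1")
print(row["protocol"], row["src_port"], row["dst_port"], row["packets"])
```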

The following command shows the output of the Dynamic Top Talkers command, here listing flows aggregated by destination address. The command was introduced in Release 12.4(4)T. It can search the NetFlow cache in various ways and sort the results by number of flows, packets, or bytes, which makes it useful for troubleshooting and for real-time security monitoring.

R3#show ip flow top 10 aggregate destination-address

There are 3 top talkers:

IPV4 DST-ADDR bytes pkts flows

=============== ========== ========== ==========

172.16.1.86 160 4 2

172.16.1.85 160 4 2

172.16.1.84 160 4 2

The following is an example of the Dynamic Top Talkers command that aggregates flows by destination address, sorts them by bytes, and matches only a range of source ports:

R3#show ip flow top 10 aggregate destination-address sorted-by bytes match source-port min 0 max 1000

There are 3 top talkers:

IPV4 DST-ADDR bytes pkts flows

=============== ========== ========== ==========

172.16.1.84 80 2 2

172.16.1.85 80 2 2

172.16.1.86 80 2 2

6 of 6 flows matched.
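To make the aggregate, sorted-by, and match options concrete, here is a rough Python sketch of the same idea applied to an in-memory list of flows. It is not how the router implements the command; the function name top_destinations and the sample flow records are hypothetical.

```python
from collections import defaultdict

def top_destinations(flows, n, min_port=0, max_port=65535):
    """Aggregate flows by destination address and return the top n by bytes."""
    totals = defaultdict(lambda: {"bytes": 0, "pkts": 0, "flows": 0})
    for f in flows:
        if not (min_port <= f["src_port"] <= max_port):
            continue  # same idea as "match source-port min ... max ..."
        t = totals[f["dst"]]
        t["bytes"] += f["bytes"]
        t["pkts"] += f["pkts"]
        t["flows"] += 1
    ranked = sorted(totals.items(), key=lambda kv: kv[1]["bytes"], reverse=True)
    return ranked[:n]

# Hypothetical cache contents, not taken from the router output above.
flows = [
    {"dst": "172.16.1.84", "src_port": 80,   "bytes": 40,  "pkts": 1},
    {"dst": "172.16.1.84", "src_port": 135,  "bytes": 40,  "pkts": 1},
    {"dst": "172.16.1.85", "src_port": 80,   "bytes": 120, "pkts": 3},
    {"dst": "172.16.1.86", "src_port": 5000, "bytes": 900, "pkts": 9},
]
result = top_destinations(flows, 10, min_port=0, max_port=1000)
for dst, t in result:
    print(dst, t["bytes"], t["pkts"], t["flows"])
```

The flow to 172.16.1.86 is the largest by bytes, but it is filtered out because its source port falls outside the matched range.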

Other Examples include:

• Top 10 protocols currently flowing through the router:

router# show ip flow top 10 aggregate protocol

• Top 10 IP addresses which are sending the most packets:

router# show ip flow top 10 aggregate source-address sorted-by packets

• Top 5 destination addresses to which we're routing most traffic from the 10.0.0.1/24 prefix:

router# show ip flow top 5 aggregate destination-address match source-prefix 10.0.0.1/24

• The 50 VLANs to which we're sending the fewest bytes:

router# show ip flow top 50 aggregate destination-vlan sorted-by bytes ascending

• Top 50 sources of 1-packet flows:

router# show ip flow top 50 aggregate source-address match packets 1

More information on Dynamic Top Talkers can be found at:

=======
NetFlow Aggregation

By maintaining one or more extra flow caches, called aggregation caches, the NetFlow Aggregation feature allows limited aggregation of NetFlow data export streams to be done on a router.

Aggregation Cache Schemes

The aggregation cache schemes are described in the following sections:

• Autonomous System Aggregation Scheme

• Destination Prefix Aggregation Scheme

• Prefix Aggregation Scheme

• Protocol Port Aggregation Scheme

• Source Prefix Aggregation Scheme

• Aggregation Scheme Fields and Key Fields

You can configure each aggregation cache with its individual cache size, cache ager timeout parameter, export destination IP address, and export destination UDP port. As data flows expire in the main NetFlow cache, the flows are added to each enabled aggregation cache. Each aggregation cache contains different field combinations that determine which data flows are grouped. The default aggregation cache size is 4096.
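The grouping step can be sketched in Python as follows. This is a simplified illustration of a destination-prefix style cache: it assumes a fixed /24 mask and keys only on the prefix, whereas the router's scheme uses additional key fields. The function name and the sample flows are hypothetical.

```python
import ipaddress
from collections import defaultdict

def aggregate_by_dst_prefix(expired_flows, prefix_len=24):
    """Group expired flows by destination prefix, summing their counters."""
    cache = defaultdict(lambda: {"flows": 0, "packets": 0, "bytes": 0})
    for flow in expired_flows:
        # strict=False masks the host bits, e.g. 10.1.2.3/24 -> 10.1.2.0/24
        net = ipaddress.ip_network(f"{flow['dst']}/{prefix_len}", strict=False)
        entry = cache[str(net)]
        entry["flows"] += 1
        entry["packets"] += flow["packets"]
        entry["bytes"] += flow["bytes"]
    return dict(cache)

# Hypothetical flows that have just expired from the main cache.
expired = [
    {"dst": "10.1.2.3", "packets": 4, "bytes": 400},
    {"dst": "10.1.2.9", "packets": 1, "bytes": 60},
    {"dst": "10.9.0.1", "packets": 2, "bytes": 120},
]
aggregated = aggregate_by_dst_prefix(expired)
print(aggregated)
```

Three flow records collapse into two aggregate records, which is the reduction in export volume that aggregation caches are meant to provide.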

Sampled NetFlow allows you to collect NetFlow statistics for a subset of incoming (ingress) IPv4 traffic on the interface, selecting only one out of “N” sequential packets, where “N” is a configurable parameter.

Sampling packets in this way substantially decreases the CPU utilization needed to account for NetFlow packets, because the majority of the packets are switched faster without going through additional NetFlow processing.
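A minimal sketch of one-in-N sampling, assuming the simplest possible selection rule (take the first packet of each group of N). The packet sizes are made up, and the scaling step is a common way to estimate totals from samples rather than something the text above prescribes.

```python
def sample_one_in_n(packets, n):
    """Yield the first packet of every group of n sequential packets."""
    for index, packet in enumerate(packets):
        if index % n == 0:
            yield packet

# Hypothetical stream of 1000 packet sizes (bytes), sampled 1 in 100.
packet_sizes = [64 + (i % 7) * 100 for i in range(1000)]
n = 100
sampled = list(sample_one_in_n(packet_sizes, n))

# Scale the sampled counters by n to estimate the unsampled totals.
estimated_packets = len(sampled) * n
estimated_bytes = sum(sampled) * n
print(len(sampled), estimated_packets, estimated_bytes, sum(packet_sizes))
```

Only 10 of the 1000 packets go through the accounting path. The packet estimate is exact here, while the byte estimate is only approximate, which is the accuracy traded for lower CPU load.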

BGP Policy Accounting

Border Gateway Protocol (BGP) policy accounting measures and classifies IP traffic that is sent to, or received from, different peers. Policy accounting is enabled on an input interface, and counters based on parameters such as community list, autonomous system number, or autonomous system path are assigned to identify the IP traffic.

Using the BGP table-map command, prefixes added to the routing table are classified by BGP attribute, autonomous system number, or autonomous system path. Packet and byte counters are incremented per input interface. A Cisco IOS policy-based classifier maps the traffic into one of eight possible buckets, representing different traffic classes.
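The bucket counters can be pictured with a small Python sketch. Everything here is an assumption made for illustration: the prefix-to-bucket table stands in for whatever the table-map classification produced, the interface names are invented, and unclassified traffic is simply ignored.

```python
import ipaddress
from collections import defaultdict

NUM_BUCKETS = 8  # the text describes eight possible traffic classes

# Hypothetical classification: routing prefix -> bucket number (1..8).
prefix_to_bucket = {
    ipaddress.ip_network("192.0.2.0/24"): 1,
    ipaddress.ip_network("198.51.100.0/24"): 2,
}

# counters[(input_interface, bucket)] = [packets, bytes]
counters = defaultdict(lambda: [0, 0])

def account(interface, dst_ip, size):
    """Find the longest matching prefix and bump that bucket's counters."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [net for net in prefix_to_bucket if addr in net]
    if not matches:
        return None  # unclassified traffic is not counted in this sketch
    best = max(matches, key=lambda net: net.prefixlen)
    bucket = prefix_to_bucket[best]
    assert 1 <= bucket <= NUM_BUCKETS
    counters[(interface, bucket)][0] += 1
    counters[(interface, bucket)][1] += size
    return bucket

account("Et0/0", "192.0.2.10", 1500)
account("Et0/0", "192.0.2.11", 500)
account("Et1/0", "198.51.100.7", 64)
print(dict(counters))
```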

Benefits

Account for IP Traffic Differentially

BGP policy accounting classifies IP traffic by autonomous system number, autonomous system path, or community list string, and increments packet and byte counters. Service providers can account for traffic and apply billing according to the route that specific traffic traverses.

Efficient Network Circuit Peering and Transit Agreement Design

Implementing BGP policy accounting on an edge router can highlight potential design improvements for peering and transit agreements.

If you specify both the source and destination addresses when configuring policy propagation based on an access control list (ACL), the software looks up the source address in the routing table and classifies the packet based on the source address first; then the software looks up the destination address in the routing table and reclassifies the packet based on the destination address.

Sunday, September 9, 2012

Linux in brief!!

What Is an Operating System
In simple terms, an operating system is a manager. It manages all the available resources on a computer. These resources can be the hard disk, a printer, or the monitor screen. Even memory is a resource that needs to be managed. Within an operating system are the management functions that determine who gets to read data from the hard disk, what file is going to be printed next, what characters appear on the screen, and how much memory a certain program gets.
Another function of the operating system is to keep track of what each program is doing. That is, the operating system needs to keep track of whose program, or task, is currently writing its file to the printer, which program needs to read a certain spot on the hard disk, and so on. This is the concept of a multi-user system, in which multiple users have access to the same resources.

Processes
One basic concept of an operating system is the process. Think of the program as the file stored on the hard disk or floppy and the process as that program in memory.
A process is more than just a program. Especially in a multi-user, multi-tasking operating system such as UNIX, there is much more to consider. Each program has a set of data that it uses to do what it needs. Often, this data is not part of the program. For example, if you are using a text editor, the file you are editing is not part of the program on disk but is part of the process in memory. If someone else were to be using the same editor, both of you would be using the same program. However, each of you would have a different process in memory. Programs are read from the hard disk to become processes.

With the exception of the init process (PID 1), every process is the child of another process. In general, every process has the potential to be the parent of another process. Perhaps the program is coded in such a way that it will never start another process. However, this is a limitation of that program and not the operating system. A process has only one parent but may have many children.
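The parent and child relationship is easy to observe from Python. In this small sketch the current process starts a child interpreter, and the child prints the PID of its parent; the variable names are arbitrary.

```python
import os
import subprocess
import sys

# Start a child Python process that reports its parent's PID.
child = subprocess.run(
    [sys.executable, "-c", "import os; print(os.getppid())"],
    capture_output=True, text=True, check=True,
)
parent_seen_by_child = int(child.stdout.strip())

print("this process:", os.getpid())
print("parent reported by child:", parent_seen_by_child)
```

The two numbers match: the child has exactly one parent, and it is the process that started it.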

The processors used by Linux (Intel 80386 and later, as well as the DEC Alpha and SPARC) have built-in capabilities to manage both multiple users and multiple tasks.
In addition to user processes, such as shells, text editors, and databases, there are system processes running. These are processes that were started by the system. Several of these deal with managing memory and scheduling turns on the CPU. Others deal with delivering mail, printing, and other tasks that we take for granted. In principle, both of these kinds of processes are identical. However, system processes can run at much higher priorities and therefore run more often than user processes.
Typically, system processes of this kind are referred to as daemon processes or background processes because they run behind the scenes (i.e. in the background) without user intervention. It is also possible for a user to put one of his or her processes in the background. This is done by using the ampersand (&) metacharacter at the end of the command line.
What normally happens when you enter a command is that the shell will wait for that command to finish before it accepts a new command. By putting a command in the background, the shell does not wait, but rather is ready immediately for the next command. If you wanted, you could put the next command in the background as well.
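A rough Python analogue of putting a command in the background: subprocess.Popen starts the child and returns at once instead of waiting for it to finish. The one-second sleep is just a stand-in for a long-running command.

```python
import subprocess
import sys

# Popen returns immediately, much as the shell does for "command &".
job = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(1)"])

still_running = job.poll() is None   # None means the child has not exited
job.wait()                           # block until it finishes, like "wait"
print(still_running, job.returncode)
```

Between Popen and wait, the parent is free to do other work, but the child is a full process the whole time and consumes resources like any other.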

I have talked to customers who have complained about their systems grinding to a halt after they put dozens of processes in the background. The misconception is that because they didn't see the process running, it must not be taking up any resources. (Out of sight, out of mind.) The issue here is that even though the process is running in the background and you can't see it, it still behaves like any other process.

Virtual Memory Basics
One interesting aspect about modern operating systems is the fact that they can run programs that require more memory than the system actually has.
At the extreme end, this means that if your CPU is 32-bit (meaning that it has registers that are 32 bits wide), you can access up to 2^32 bytes (that is, 4,294,967,296, or about 4 billion). That means you would need 4 GB of main memory (RAM) in order to take full advantage of this.
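The arithmetic behind the 32-bit limit, as a quick check in Python:

```python
ADDRESS_BITS = 32

addressable_bytes = 2 ** ADDRESS_BITS
addressable_gib = addressable_bytes // 1024 ** 3

print(addressable_bytes)   # 4294967296
print(addressable_gib)     # 4
```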
The interesting thing is that when you sum the memory requirements of the programs you are running, you often reach far beyond the physical memory you have. Currently my system appears to need about 570 MB, although my machine has only 384 MB. Surprisingly enough, I don't notice any performance problems. So, how is this possible?

From the user's perspective the email program (or parts of the word processor) are loaded into memory. However, the system only loads what it needs. In some cases, they might all be in memory at once. However, if you load enough programs, you eventually reach a point where you have more programs than you have memory.
To solve this problem, Linux uses something called "virtual memory". It's virtual because it can use more than you actually have. In fact, with virtual memory you can use the whole 2^32 bytes. Basically, what this means is that you can run more programs at once without the need to buy more memory.
If you have more data than physical memory, the system might store it temporarily on the hard disk should it not be needed at the moment. The process of moving data to and from the hard disk like this is called swapping, as the data is "swapped" in and out. Typically, when you install the system, you define a specific partition as the swap partition, or swap "space". However, Linux can also swap to a physical file, although with older Linux versions this is much slower than a special partition. An old rule of thumb is that you should have at least as much swap space as you do physical RAM; this ensures that all of the data can be swapped out, if necessary. You will also find that some texts say that you should have at least twice as much swap as physical RAM. We go into details on swap in the section on installing and upgrading.

Files and Directories
There are three kinds of files with which most people are familiar: programs, text files, and data files. However, on a UNIX system, there are other kinds of files. One of the most common is a device file. These are often referred to as device files or device nodes. Under UNIX, every device is treated as a file. Access is gained to the hardware by the operating system through the device files. These tell the system what specific device driver needs to be used to access the hardware.

Another kind of file is a pipe. Like a real pipe, stuff goes in one end and out the other. Some are named pipes. That is, they have a name and are located permanently on the hard disk. Others are temporary and are unnamed pipes. These do not exist once the process using them has ended; on Linux their data is buffered in kernel memory rather than stored on the hard disk.
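An unnamed pipe can be created directly from Python with os.pipe(). This minimal sketch writes into one end and reads from the other within a single process; in practice the two ends usually belong to different processes.

```python
import os

# os.pipe() creates an unnamed pipe and returns (read_fd, write_fd).
read_fd, write_fd = os.pipe()

os.write(write_fd, b"stuff goes in one end")
os.close(write_fd)                 # closing the write end signals end-of-data

data = os.read(read_fd, 1024)      # ...and comes out the other
os.close(read_fd)
print(data.decode())
```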

Under Linux, a directory is actually nothing more than a file itself with a special format. It contains the names of the files associated with it and some pointers or other information to tell the system where the data for the file actually reside on the hard disk.

The directories have information that points to where the real files are.
One kind of file is a directory. What this kind of file can contain are files and more directories. These, in turn, can contain still more files and directories. The result is a hierarchical tree structure of directories, files, more directories, and more files. Directories that contain other directories are referred to as the parent directory of the child or subdirectory that they contain. (Most references I have seen refer only to parent and subdirectories. Rarely have I seen references to child directories.)

When referring to directories under UNIX, there is often either a leading or trailing slash ("/"), and sometimes both. The top of the directory tree is referred to with a single "/" and is called the "root" directory. Subdirectories are referred to by this slash followed by their name, such as /bin or /dev.

One thing to note is that John's business letter to Chris may be the exact same file as Jim's. I am not talking about one being a copy of the other. Rather, I am talking about a situation where both names point to the same physical locations on the hard disk. Because both files are referencing the same bits on the disk, they must therefore be the same file.
This is accomplished through the concept of a link. Like a chain link, a file link connects two pieces together. Think of the "telephone number" for a file as its inode. This number actually points to a special place on the disk called the inode table, with the inode number being the offset into this table. Each entry in this table contains not only the file's physical location on the disk, but also the owner of the file, the access permissions, and the number of links, as well as many other things. In the case where two files reference the same entry in the inode table, these are referred to as hard links. A soft link or symbolic link is where a file is created that contains the path of the other file. We will get into the details of this later.
An inode does not contain the name of a file. The name is only contained within the directory. Therefore, it is possible to have multiple directory entries that have the same inode, just as there can be multiple entries in the phone book, all with the same phone number. We'll get into a lot more detail about inodes in the section on filesystems. A directory and where the inodes point to on the hard disk might look like this

Let's think about the telephone book analogy once again. Although it is not common for an individual to have multiple listings, there might be two people with the same number. For example, if you were sharing a house with three of your friends, there might be only one telephone. However, each of you would have an entry in the phone book. I could get the same phone to ring by dialing the telephone number of four different people. I could also get the same inode with four different file names.
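The difference between hard and symbolic links can be checked with a short Python sketch. The file names are invented for the example, and it assumes a POSIX filesystem that supports both link types.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    original = os.path.join(d, "letter.john")
    hard = os.path.join(d, "letter.jim")
    soft = os.path.join(d, "letter.link")

    with open(original, "w") as f:
        f.write("Dear Chris,\n")

    os.link(original, hard)        # second directory entry, same inode
    os.symlink(original, soft)     # new file that stores the other path

    same_inode = os.stat(original).st_ino == os.stat(hard).st_ino
    link_count = os.stat(original).st_nlink
    soft_is_own_file = os.lstat(soft).st_ino != os.stat(original).st_ino
    target = os.readlink(soft)

print(same_inode, link_count, soft_is_own_file, target == original)
```

The two hard-linked names share one inode whose link count is 2, while the symbolic link has its own inode and merely records the path of the original.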

Under Linux, files and directories are grouped into units called filesystems. A filesystem is a portion of your hard disk that is administered as a single unit. Filesystems exist within a section of the hard disk called a partition. Each hard disk can be broken down into multiple partitions and the filesystem is created within the partition. Each has specific starting and ending points that are managed by the system. (Note: Some dialects of UNIX allow multiple filesystems within a partition.)

Operating System Layers
What accesses the hardware is a set of functions within the operating system itself (the kernel) called device drivers. If it does not behave correctly, a device driver has the potential of wiping out data on your hard disk or "crashing" your system. Because a device driver needs to be sure that it has properly completed its task (such as accurately writing or reading from the hard disk), it cannot quit until it has finished.

Under Linux, there are many sets of programs that serve common functions. This includes things like mail or printing. These groups of related programs are referred to as "System Services", whereas individual programs such as vi or fdisk are referred to as utilities. Programs that perform a single function such as ls or date are typically referred to as commands.
The top-most directory is the root directory. In verbal conversation, you say "root directory" or "slash," whereas it may be referred to in text as simply "/."
The first directory we get to is /bin. Its name is derived from the word "binary." Often, the word "binary" is used to refer to executable programs or other files that contain non-readable characters. The /bin directory is where many of the system-related binaries are kept, hence the name. Although several of the files in this directory are used for administrative purposes and cannot be run by normal users, everyone has read permission on this directory, so you can at least see what the directory contains.
The /boot directory is used to boot the system. There are several files here that the system uses at different times during the boot process. For example, the files /boot/boot.???? are copies of the original boot sector from your hard disk. (for example boot.0300) Files ending in .b are "chain loaders," secondary loaders that the system uses to boot the various operating systems that you specify.

The /dev directory contains the device nodes. As I mentioned in our previous discussion on operating system basics, device files are the way both the operating system and users gain access to the hardware. Every device has at least one device file associated with it. If it doesn't, you can't gain access to it. We'll get into more detail on individual device files later.
The /etc directory contains files and programs that are used for system configuration. Its name comes from the common abbreviation etc., for etcetera, meaning "and so on." This seems to come from the fact that on many systems, /etc contains files that don't seem to fit elsewhere.
There are several directories named /etc/cron*. As you might guess, these are used by the cron daemon. The /etc/cron.d directory contains configuration files used by cron. Typically what is here are various system-related cron jobs, such as /etc/cron.d/seccheck, which does various security checks. The directories /etc/cron.hourly, /etc/cron.daily, /etc/cron.weekly, and /etc/cron.monthly contain files with cron jobs which run hourly, daily, weekly, and monthly, respectively. There is a cron job listed in /etc/crontab that runs the program /usr/lib/cron/run-crons, which checks the other files.
The /lib directory (for library) contains the libraries needed by the operating system as it is running. You will also find several subdirectories.
The /usr directory contains many user-related subdirectories. Note the 'e' is missing from "user". In general, one can say that the directories and files under /usr are used by and related to users. There are programs and utilities here that users use on a daily basis. Unless changed on some systems, /usr is where users have their home directory. The figure below shows what the subdirectories of /usr would look like graphically.
Where /bin contains programs that are used by both users and administrators, /usr/bin contains files that are almost exclusively used by users. (However, like everything in UNIX, there are exceptions.) Here again, the bin directory contains binary files. In general, you can say that the programs and utilities that all users more or less require are stored in /bin, whereas the "nice-to-have" programs and utilities are stored in /usr/bin. Programs and utilities needed for administrative tasks are stored in /sbin. Note that it is common to separate files like this, but it is not an absolute rule.
The /usr/src directory contains the source code for both the Linux kernel and for any program that you specifically install.
Many versions of Linux are now using the Red Hat Package Manager (RPM) format. In fact, RPM is perhaps the format most commonly found on the Internet. Most sites will have new or updated programs as RPM files. You can identify this format by the rpm extension to the file name.
This has proven itself to be a much more robust mechanism for adding and removing packages, as it is much easier to add and manage single programs than with Slackware. We'll get into more detail about this when I talk about installing. You will also find that RPM packages are also grouped into larger sets like those in Slackware, so the concepts are the same.

kill
By default, the kill command sends a termination signal to that process. Unfortunately, there are some cases where a process can ignore that termination signal. However, you can send a much more urgent "kill" signal like this:

kill -9
Where "9" is the number of the SIGKILL or kill signal. In general, you should first try to use signal 15 or SIGTERM. This sends a terminate signal and gives the process a chance to end "gracefully". You should also look to see if the process you want to stop has any children.

In some circumstances, it is not easy to kill processes by their PID. For example, if something starts dozens of other processes, it is impractical to type in all of their PIDs. To solve this problem, Linux has the killall command, which takes the command name instead of the PID. You can also use the -i (--interactive) option to be asked whether each process should be killed, or the -w (--wait) option to wait for all killed processes to die. Note that if a process ignores the signal or if it is a zombie, then killall may end up waiting forever.
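The "SIGTERM first, SIGKILL only if needed" advice can be sketched in Python on a POSIX system. The sleeping child is a hypothetical stand-in for the process to be stopped, and the five-second grace period is an arbitrary choice.

```python
import os
import signal
import subprocess
import sys

# Hypothetical long-running child standing in for a process to stop.
child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])

os.kill(child.pid, signal.SIGTERM)         # same as "kill <PID>" (signal 15)
try:
    child.wait(timeout=5)                  # give it a chance to end gracefully
except subprocess.TimeoutExpired:
    os.kill(child.pid, signal.SIGKILL)     # same as "kill -9 <PID>"
    child.wait()

# On POSIX a negative return code is the number of the terminating signal.
print(child.returncode)
```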

The Shell
As I mentioned in the section on introduction to operating systems, the shell is essentially a user's interface to the operating system. The shell is a command line interpreter, just like those in other operating systems. In Windows you open up a "command window" or "DOS box" to input commands, which is nothing other than a command line interpreter. Through it, you issue commands that are interpreted by the system to carry out certain actions. Often, the state where the system is sitting at a prompt, waiting for you to type input, is referred to (among other things) as being at the shell prompt or at the command line.
The current directory is referenced by "." and its parent by ".." (often referred to in conversation as "dot" and "dot-dot").

Permissions
Permissions are set on a file using the chmod command or when the file is created (the details of which I will save for later). You can read the permissions on a file by using either the l command or ls -l. At the beginning of each line will be ten characters, which can either be dashes or letters. The first position is the type of the file, whether it is a regular file (-), a directory (d), a block device file (b), and so on. Below are some examples of the various file types.
- - regular file
c - character device
b - block device
d - directory
p - named pipe
l - symbolic link
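The same type character that ls -l prints can be obtained in Python with stat.filemode(). This sketch creates a few throwaway files in a temporary directory; the helper name type_char is hypothetical, and os.mkfifo assumes a POSIX system.

```python
import os
import stat
import tempfile

def type_char(path):
    """First character of the ls -l style mode string for path."""
    # lstat so that a symbolic link reports 'l' instead of its target's type
    return stat.filemode(os.lstat(path).st_mode)[0]

with tempfile.TemporaryDirectory() as d:
    regular = os.path.join(d, "file.txt")
    open(regular, "w").close()
    link = os.path.join(d, "link")
    os.symlink(regular, link)
    fifo = os.path.join(d, "fifo")
    os.mkfifo(fifo)                  # named pipe (POSIX only)
    kinds = [type_char(p) for p in (regular, d, link, fifo)]

print(kinds)   # ['-', 'd', 'l', 'p']
```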

File and Directory Basics
Command Function
cd change directory
cp copy files
file determine a file's contents
ls list files or directories
ln make a link to a file
mkdir make a directory
mv move (rename) a file
rm remove a file
rmdir remove a directory

File Viewing
Command Function
cat Display the contents of file
less Page through files
head show the top portion of a file
more display screenfuls of a file
tail display bottom portion of a file
nl number the lines of a file
wc count the number of lines, words and characters in a file
od View a binary file
tee display output on stdout and write it to a file simultaneously

File Management
Command Function
ls display file attributes
stat display file attributes
wc count the number of lines, words and characters in a file
file identify file types
touch set the time stamp of a file or directory
chgrp change the group of a file
chmod change the permissions (mode) of a file
chown change the owner of a file
chattr change advanced file attributes
lsattr display advanced file attributes

File Manipulation
Command Function
awk pattern-matching, programming language
csplit split a file
cut display columns of a file
paste append columns in a file
dircmp compare two directories
find find files and directories
perl scripting language
sed Stream Editor
sort sort a file
tr translate characters in a file
uniq find unique or repeated lines in a file
xargs process multiple arguments

File Editing
Command Function
vi text editor
emacs text editor
sed Stream Editor

Locate Files
Command Function
find find files and directories
which locate commands within your search path
whereis locate standard files

File Compression and Archiving
Command Function
gzip compress a file using GNU Zip
gunzip uncompress a file using GNU Zip
compress compress a file using UNIX compress
uncompress uncompress a file using UNIX compress
bzip2 compress a file using block-sorting file compressor
bunzip2 uncompress a file using block-sorting file compressor
zip compress a file using Windows/DOS zip
unzip uncompress a file using Windows/DOS zip
tar read/write (tape) archives
cpio copy files to and from archives
dump dump a disk to tape
restore restore a dump
mt tape control program

File Comparison
Command Function
diff find differences in two files
cmp compare two files
comm compare sorted files
md5sum compute the MD5 checksum of a file
sum compute the checksum of a file

Disks and File Systems
Command Function
df display free space
du display disk usage
mount mount a filesystem
fsck check and repair a filesystem
sync Flush disk caches

Printing
Command Function
lpr print files
lpq view the print queue
lprm Remove print jobs
lpc line printer control program

Process Management
Command Function
ps list processes
w list users' processes
uptime view the system load, amount of time it has been running, etc.
top monitor processes
free display free memory
kill send signals to processes
killall kill processes by name
nice set a process's nice value
renice set the nice value of a running process.
at run a job at a specific time
crontab schedule repeated jobs
batch run a job as the system load permits
watch run a program at specific intervals
sleep wait for a specified interval of time

Host Information
Command Function
uname Print system information
hostname Print the system's hostname
ifconfig Display or set network interface configuration
host lookup DNS information
nslookup lookup DNS information (deprecated)
whois Lookup domain registrants
ping Test reachability of a host
traceroute Display network path to a host

Networking Tools
Command Function
ssh Secure remote access
telnet Log into remote hosts
scp Securely copy files between hosts
ftp Copy files between hosts
wget Recursively download files from a remote host
lynx Character based web-browser

What is a process?
A process is a program in execution. The components of a process are: the program to be executed, the data on which the program will execute, the resources required by the program (such as memory and files), and the status of the execution.
Is a process the same as a program? No, it is both more and less.
• more: a program is just part of a process context. tar can be executed by two different people: the same program (shared code) as part of different processes.
• less: a program may invoke several processes. cc invokes cpp, cc1, cc2, as, and ld.

Programming: uni- versus multi-
Some systems allow execution of only one process at a time (e.g., early personal computers).
They are called uniprogramming systems.
Others allow more than one process, i.e., concurrent execution of many processes. They are called multiprogramming (NOT multiprocessing!) systems.
In a multiprogramming system, the CPU switches automatically from process to process, running each for tens or hundreds of milliseconds. In reality, the CPU is running one and only one process at any instant.

Process states
There are a number of states that can be attributed to a process: indeed, the operation of a multiprogramming system can be described by a state transition diagram on the process states. The states of a process include:
• New—a process being created but not yet included in the pool of executable processes (resource acquisition).
• Ready—processes that are prepared to execute when given the opportunity.
• Active—the process that is currently being executed by the CPU.
• Blocked—a process that cannot execute until some event occurs.
• Stopped—a special case of blocked where the process is suspended by the operator or the user.
• Exiting—a process that is about to be removed from the pool of executable processes (resource release).
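The states above can be modelled as a small state machine. The text lists the states but not the allowed transitions, so the transition table in this Python sketch is an assumption based on the usual textbook diagram; the names State, TRANSITIONS, and move are likewise hypothetical.

```python
from enum import Enum

class State(Enum):
    NEW = "new"
    READY = "ready"
    ACTIVE = "active"
    BLOCKED = "blocked"
    STOPPED = "stopped"
    EXITING = "exiting"

# Assumed transition table; the text lists the states but not the edges.
TRANSITIONS = {
    State.NEW: {State.READY},
    State.READY: {State.ACTIVE},
    State.ACTIVE: {State.READY, State.BLOCKED, State.STOPPED, State.EXITING},
    State.BLOCKED: {State.READY},
    State.STOPPED: {State.READY},
    State.EXITING: set(),
}

def move(current, target):
    """Return target if the transition is allowed, else raise ValueError."""
    if target not in TRANSITIONS[current]:
        raise ValueError(f"{current.name} -> {target.name} not allowed")
    return target

state = State.NEW
for nxt in (State.READY, State.ACTIVE, State.BLOCKED, State.READY,
            State.ACTIVE, State.EXITING):
    state = move(state, nxt)
print(state.name)
```

In this model a blocked process cannot go straight back to the CPU: it first becomes ready and then waits for the scheduler to make it active again.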

Threads
A unit of execution (unit of dispatching) and a collection of resources with which that unit of execution is associated together characterize the notion of a process.
A thread is the abstraction of a unit of execution. It is also referred to as a light-weight process (LWP).
As a basic unit of CPU utilization, a thread consists of an instruction pointer (also referred to as the PC or instruction counter), a CPU register set, and a stack. A thread shares its code and data, as well as system resources and other OS-related information, with its peer group (other threads of the same process).
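A short Python sketch of what threads share and what they keep to themselves: each thread has its own local variables (its own stack), while the list shared_results belongs to the process and is visible to all of them. The worker function and the input values are arbitrary.

```python
import threading

shared_results = []            # data shared by every thread in the process
lock = threading.Lock()

def worker(n):
    local_total = sum(range(n))    # local variable: private to this thread
    with lock:                     # shared data needs synchronized access
        shared_results.append((n, local_total))

threads = [threading.Thread(target=worker, args=(n,)) for n in (10, 100, 1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared_results))
```

The results are sorted before printing because the order in which the threads finish is not guaranteed.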
