Showing posts with label analyzer. Show all posts

Wednesday, November 20, 2024

SC24 Real-time RoCEv2 traffic visibility

The chart shows eight 400Gbits/s RDMA over Converged Ethernet (RoCEv2) flows, typically seen in AI / ML data centers, totaling 3.2 Tbits/s. The unique challenge in this case is that flows are being routed from locations scattered around the United States to Atlanta, the location of the International Conference for High Performance Computing, Networking, Storage, and Analysis (SC24) conference.
SC24 Network Research Exhibit: The Resilient, Performant Networks and Distributed Processing demonstration aims to explore performance limitations and enablers for high volume bulk data transfers. Maintaining stable 400Gbits/s RoCEv2 connections over a wide area network is challenging: packets have to traverse multiple links, avoid contention on those links, and deal with buffering associated with transmission latency that is orders of magnitude higher than in the data center environments where RoCEv2 is typically deployed. One way latency across the USA is at least 16 milliseconds due to the speed of light, and in practice quite a bit larger, whereas latency across a leaf and spine data center fabric is measured in microseconds.
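The bandwidth-delay product gives a sense of the buffering challenge: at 400Gbits/s with a 32 millisecond round trip (twice the 16 millisecond one-way minimum quoted above), more than a gigabyte of data is in flight on a single connection. A rough back-of-the-envelope calculation:
# Rough bandwidth-delay product for a coast-to-coast 400Gbit/s flow.
# The 32ms round trip is an assumption (2 x the 16ms one-way minimum above);
# real paths will be longer.
bandwidth_bps = 400e9   # 400 Gbits/s
rtt_seconds = 0.032     # assumed round trip time

bdp_bytes = bandwidth_bps * rtt_seconds / 8
print('Data in flight: %.1f GBytes' % (bdp_bytes / 1e9))   # ~1.6 GBytes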
During setup it was noticed that total throughput with 8 concurrent flows was only 2.7Tbits/s (instead of the expected 3Tbits/s or more). Examining a real-time view of the throughput revealed that the two smallest flows, pink and light green at the top of the chart, were likely sharing a 400Gbits/s path since each flow was only transferring 200Gbits/s. The next flow down, light blue, appeared to be unstable and wasn't maintaining a constant 400Gbits/s.
Drilling down to look at the unstable flow showed that it was oscillating between 280Gbits/s and 400Gbits/s with a period of around 15 seconds. Further investigation revealed that the cause of the instability was a collision with a smaller flow on one of the links traversed by this flow. Once the flow collisions were resolved, all flows achieved close to 400Gbits/s, allowing the full 3Tbits/s transfer rate shown at the top of this article.
In this example, the sFlow-RT real-time analytics engine receives sFlow telemetry from switches, routers, and servers in the SCinet network and creates metrics to drive the real-time charts. Getting Started provides a quick introduction to deploying and using sFlow-RT for real-time network-wide flow analytics.
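As a concrete illustration of this pattern, the following Python sketch uses the sFlow-RT REST API to define a flow and poll the resulting metric once a second, much as the charts above are driven. The flow name (sc_demo), the keys, and the localhost address are placeholders rather than the actual SC24 configuration:
#!/usr/bin/env python
# Minimal sketch: define a flow and poll its value to drive a real-time chart.
# Flow name, keys, and sFlow-RT address are examples, not the SC24 setup.
import requests, json, time

rt = 'http://localhost:8008'

# track bytes per source/destination address pair
flow = {'keys': 'ipsource,ipdestination', 'value': 'bytes'}
requests.put(rt + '/flow/sc_demo/json', data=json.dumps(flow))

while True:
    for m in requests.get(rt + '/metric/ALL/sc_demo/json').json():
        print(m['metricName'], m['metricValue'])
    time.sleep(1)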

Real-time network visibility is particularly relevant to AI / ML data center networks where congestion and dropped packets can result in serious performance degradation of machine learning tasks. Industry standard sFlow instrumentation is supported by the high speed 400/800G switches currently being deployed in AI / ML data centers. Enabling sFlow analytics provides the visibility needed to optimize performance.

Network visibility complements existing system management tools used to provide visibility into compute nodes, extending visibility into the fabric to directly observe problems in the network that can't easily be inferred from the compute nodes, and providing a second pair of eyes with an independent view of performance.

Finally, check out the SC24 Dropped packet visibility demonstration to learn about one of the newest developments in sFlow monitoring and see a live demonstration.

Wednesday, May 8, 2019

Secure forwarding of sFlow using ssh

Typically sFlow datagrams are sent unencrypted from agents embedded in switches and routers to a local collector/analyzer. Sending sFlow datagrams over the management VLAN or out of band management network generally provides adequate isolation and security within the site. Inter-site traffic within an organization is typically carried over a virtual private network (VPN) which encrypts the data and protects it from eavesdropping.

This article describes a simple method of carrying sFlow datagrams over an encrypted ssh connection which can be useful in situations where a VPN is not available, for example, sending sFlow to an analyzer in the public cloud, or to an external consultant.

The diagram shows the elements of the solution. A collector on the site receives sFlow datagrams from the network devices and uses the sflow_fwd.py script to convert the datagrams into line delimited hexadecimal strings that are sent over an ssh connection to another instance of sflow_fwd.py running on the analyzer that converts the hexadecimal strings back to sFlow datagrams.

The following sflow_fwd.py Python script accomplishes the task:
#!/usr/bin/env python3

import socket
import sys
import argparse

parser = argparse.ArgumentParser(description='Serialize/deserialize sFlow')
parser.add_argument('-c', '--collector', default='')
parser.add_argument('-s', '--server')
parser.add_argument('-p', '--port', type=int, default=6343)
args = parser.parse_args()

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)

if args.server is not None:
  # read hexadecimal strings from stdin and replay them as sFlow datagrams
  while True:
    line = sys.stdin.readline()
    if not line:
      break
    buf = bytearray.fromhex(line.strip())
    sock.sendto(buf, (args.server, args.port))
else:
  # receive sFlow datagrams and write them to stdout as hexadecimal strings
  sock.bind((args.collector, args.port))
  while True:
    buf = sock.recv(2048)
    if not buf:
      break
    print(buf.hex())
    sys.stdout.flush()
Create a user account on both the collector and analyzer machines; in this example the user is pp. Next, copy the script to both machines.

If you log into the collector machine, the following command will send sFlow to the analyzer machine:
./sflow_fwd.py | ssh pp@analyzer './sflow_fwd.py -s 127.0.0.1'
If you log into the analyzer machine, the following command will retrieve sFlow from the collector machine:
ssh pp@collector './sflow_fwd.py' | ./sflow_fwd.py -s 127.0.0.1
If a permanent connection is required, it is relatively straightforward to create a daemon using systemd. In this example, the service is being installed on the collector machine by performing the following steps:
First, log into the collector and generate an ssh key:
ssh-keygen
Next, install the key on the analyzer system:
ssh-copy-id pp@analyzer
Now create the systemd service file, /etc/systemd/system/sflow-tunnel.service:
[Unit]
Description=sFlow tunnel
After=network.target

[Service]
Type=simple
User=pp
ExecStart=/bin/sh -c "/home/pp/sflow_fwd.py | /usr/bin/ssh pp@analyzer './sflow_fwd.py -s 127.0.0.1'"
Restart=on-failure
RestartSec=30

[Install]
WantedBy=multi-user.target
Finally, use the systemctl command to enable and start the daemon:
sudo systemctl enable sflow-tunnel.service
sudo systemctl start sflow-tunnel.service
A simple way to confirm that sFlow is arriving on the analyzer machine is to use sflowtool.

There are numerous articles on this blog describing how the sFlow-RT analytics software can be used to integrate sFlow telemetry with popular metrics and SIEM (security information and event management) tools.

Monday, August 20, 2018

RDMA over Converged Ethernet (RoCE)

RDMA over Converged Ethernet is a network protocol that allows remote direct memory access (RDMA) over an Ethernet network. One of the benefits of running RDMA over Ethernet is the visibility provided by standard sFlow instrumentation embedded in the commodity Ethernet switches used to build the data center leaf and spine networks where RDMA is most prevalent.

The sFlow telemetry stream includes packet headers, sampled at line rate by the switch hardware. Hardware packet sampling allows the switch to monitor traffic at line rate on all ports, keeping up with the high speed data transfers associated with RoCE.

The diagram above shows the packet headers associated with RoCEv1 and RoCEv2 packets. Decoding the InfiniBand Global Routing Header (IB GRH) and InfiniBand Base Transport Header (IB BTH) allows an sFlow analyzer to report in detail on RoCE traffic.
The sFlow-RT real-time analytics engine recently added support for RoCE by decoding the InfiniBand Global Routing and InfiniBand Base Transport fields. The screen capture of the sFlow-RT Flow-Trend application shows traffic associated with a RoCEv2 connection between two hosts, 10.10.2.22 and 10.10.2.52. The traffic consists of SEND and ACK messages exchanged as part of a reliable connection (RC).
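Traffic matching RoCEv2 can also be isolated in a flow definition: RoCEv2 is encapsulated in UDP with destination port 4791, so a simple port filter selects it. The following Python sketch uses the REST API (rather than the decoded InfiniBand fields shown in the Flow-Trend screen capture); the flow name, sFlow-RT address, and udpdestinationport key are assumptions for illustration:
#!/usr/bin/env python
# Sketch: track RoCEv2 traffic between host pairs.
# RoCEv2 is carried in UDP with destination port 4791, so the filter
# selects it by port. Flow name and sFlow-RT address are placeholders.
import requests, json

flow = {'keys': 'ipsource,ipdestination',
        'value': 'bytes',
        'filter': 'udpdestinationport=4791'}
requests.put('http://localhost:8008/flow/rocev2/json', data=json.dumps(flow))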

The standard sFlow instrumentation provides comprehensive network wide visibility into RoCE and all other applications sharing the network resources. Real-time visibility is an essential part of automating networks, providing the feedback needed to ensure that resources are efficiently allocated and rapidly identifying overloaded resources so that remediation action can be taken before significant service degradation occurs.

Monday, April 2, 2018

Flow smoothing

The sFlow-RT real-time analytics engine includes statistical smoothing. The chart above illustrates the effect of different levels of smoothing when analyzing real-time sFlow telemetry.

The traffic generator in this example creates an alternating pattern: 1.25Mbytes/second for 30 seconds followed by a pause of 30 seconds. Smoothing time constants between 1 second and 500 seconds have been applied to generate the family of charts. The blue line is the result of 1 second smoothing and closely tracks the traffic pattern. At the other extreme, the dark red line is the result of 500 second smoothing, showing a constant 625Kbytes/second (the average of the waveform).

There is a tradeoff between responsiveness and variability (noise) when selecting the level of smoothing. Selecting a suitable smoothing level depends on the flow analytics application.

Low smoothing values are appropriate when a fast response to changing traffic is required, while higher smoothing values are appropriate when less variability is desirable.
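The smoothing behaves like an exponentially weighted moving average, so the effect of a time constant can be approximated offline. The following Python sketch models this behavior (an illustrative approximation, not the sFlow-RT implementation) against the alternating traffic pattern used above:
# Approximate the effect of a smoothing time constant on the test waveform.
# This is an illustrative EWMA model, not the sFlow-RT implementation.
def smooth(samples, time_constant, dt=1.0):
    alpha = dt / float(time_constant)
    value, out = 0.0, []
    for s in samples:
        value += alpha * (s - value)
        out.append(value)
    return out

# 1.25 MBytes/s for 30 seconds followed by 30 seconds idle, repeated for an hour
waveform = ([1.25e6] * 30 + [0.0] * 30) * 60

for t in [1, 500]:
    print(t, smooth(waveform, t)[-1])
# t=1 follows the instantaneous rate; t=500 converges toward the
# 625 KBytes/s average of the waveform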

Generating the chart

The results described in this article are easily reproduced using the testbed described in Mininet flow analytics.

The following, smoothing.js, script defines a set of flows with different smoothing periods:
var times = [1,2,5,10,20,50,100,200,500];
for(var i = 0; i < times.length; i++) {
  setFlow('t_'+times[i],{value:'bytes',t:times[i]});
}
Start sFlow-RT:
env RTPROP=-Dscript.file=smoothing.js ./start.sh
Start Mininet:
sudo mn --custom extras/sflow.py --link tc,bw=10
Type the following Mininet command to open terminals on simulated hosts, h1 and h2:
mininet> xterm h1 h2
In h2 terminal window:
iperf -s
In h1 terminal window:
while true; do iperf -c 10.0.0.2 -t 30; sleep 30; done
Plot the chart by opening the sFlow-RT URL:
http://sflow-rt:8008/metric/ALL/t_1,t_2,t_5,t_10,t_20,t_50,t_100,t_200,t_500/html
See Writing Applications for more information.

Thursday, November 17, 2016

Monitoring at Terabit speeds

The chart was generated from industry standard sFlow telemetry from the switches and routers comprising The International Conference for High Performance Computing, Networking, Storage and Analysis (SC16) network. The chart shows a number of conference participants pushing the network to see how much data they can transfer, peaking at a combined bandwidth of 3 Terabits/second over a minute just before noon and sustaining over 2.5 Terabits/second for over an hour. The traffic is broken out by MAC vendor code: routed traffic can be identified by router vendor (Juniper, Brocade, etc.) and layer 2 transfers (RDMA over Converged Ethernet) are identified by host adapter vendor codes (Mellanox, Hewlett-Packard Enterprise, etc.).

From the SCinet web page, "The Fastest Network Connecting the Fastest Computers: SC16 will host the most powerful and advanced networks in the world – SCinet. Created each year for the conference, SCinet brings to life a very high-capacity network that supports the revolutionary applications and experiments that are a hallmark of the SC conference."

SC16 live real-time weathermaps provides additional demonstrations of high performance network monitoring.

Sunday, November 13, 2016

SC16 live real-time weathermaps

Connect to https://inmon.sc16.org/sflow-rt/app/sc16-weather/html/ between now and November 17th to see a real-time heat map of The International Conference for High Performance Computing, Networking, Storage and Analysis (SC16) network.

From the SCinet web page, "The Fastest Network Connecting the Fastest Computers: SC16 will host the most powerful and advanced networks in the world – SCinet. Created each year for the conference, SCinet brings to life a very high-capacity network that supports the revolutionary applications and experiments that are a hallmark of the SC conference."

The real-time weathermap leverages industry standard sFlow instrumentation built into network switch and router hardware to provide scaleable monitoring of the SCinet network. Link colors are updated every second to reflect operational status and utilization of each link.
Clicking on a link in the map pops up a 1 second resolution strip chart showing the protocol mix carried by the link.
OSiRIS (Open Storage Research Infrastructure) is a "distributed, multi-institutional storage infrastructure that lets researchers write, manage, and share data from their own computing facility locations."

Connect to http://inmon.sc16.org/sflow-rt/app/OSiRIS-weather/html/ to see an animated diagram of the SC16 OSiRIS demonstration connecting SCinet with University of Michigan, Michigan State, Wayne State, Indiana University, USGS, and Utah Cloudlab. Click on any of the links in the diagram to see traffic.
Connect to https://inmon.sc16.org/sflow-rt/app/world-map/html/ to see a real-time view of traffic from SCinet to different countries.

The SCinet real-time weathermaps were constructed using open source components (https://github.com/pphaal/sc15-weather, https://github.com/sflow-rt/svg-weather, https://github.com/sflow-rt/dashboard-example, and https://github.com/sflow-rt/world-map) running on a single instance of the sFlow-RT real-time analytics engine. See Writing Applications and download sFlow-RT to see what you can build.

Thursday, October 13, 2016

Real-time domain name lookups

A reverse DNS request returns the domain name associated with an IP address, for example providing the name google-public-dns-a.google.com for IP address 8.8.8.8. This article demonstrates how the sFlow-RT engine incorporates domain name lookups in real-time flow analytics.

First, the dns.servers System Property is used to specify one or more DNS servers to handle the reverse lookup requests. For example, the following command uses Docker to run sFlow-RT with DNS lookups directed to server 10.0.0.1:
docker run -e "RTPROP=-Ddns.servers=10.0.0.1" \
-p 8008:8008 -p 6343:6343/udp -d sflow/sflow-rt
The following Python script dnspair.py uses the sFlow-RT REST API to define a flow and log the resulting flow records:
#!/usr/bin/env python
import requests
import json

flow = {'keys':'dns:ipsource,dns:ipdestination',
 'value':'bytes','activeTimeout':10,'log':True}
requests.put('http://localhost:8008/flow/dnspair/json',data=json.dumps(flow))
flowurl = 'http://localhost:8008/flows/json?name=dnspair&maxFlows=10&timeout=60'
flowID = -1
while 1 == 1:
  r = requests.get(flowurl + "&flowID=" + str(flowID))
  if r.status_code != 200: break
  flows = r.json()
  if len(flows) == 0: continue

  flowID = flows[0]["flowID"]
  flows.reverse()
  for f in flows:
    print(json.dumps(f,indent=1))
Running the script generates the following output:
$ ./dnspair.py
{
 "value": 233370.92322668363, 
 "end": 1476234478177, 
 "name": "dnspair", 
 "flowID": 1523, 
 "agent": "10.0.0.20", 
 "start": 1476234466195, 
 "dataSource": "10", 
 "flowKeys": "xenvm11.sf.inmon.com.,dhcp20.sf.inmon.com."
}
{
 "value": 39692.88754760739, 
 "end": 1476234478177, 
 "name": "dnspair", 
 "flowID": 1524, 
 "agent": "10.0.0.20", 
 "start": 1476234466195, 
 "dataSource": "10", 
 "flowKeys": "xenvm11.sf.inmon.com.,switch.sf.inmon.com."
}
The token dns:ipsource in the flow definition is an example of a Key Function. Functions can be combined to define flow keys or used in filters. For example:
or:[dns:ipsource]:ipsource
Returns the DNS name if available, otherwise the original IP address is returned.
suffix:[dns:ipsource]:.:3
Returns the last two parts of the DNS name, e.g. xenvm11.sf.inmon.com. becomes inmon.com.
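The same key functions can be used directly in a flow definition. For example, the following Python sketch (with a placeholder flow name and sFlow-RT address) defines a flow keyed on the source and destination domains rather than individual hosts:
#!/usr/bin/env python
# Sketch: key a flow on DNS domain suffixes instead of individual hosts.
# The flow name (dnsdomains) and sFlow-RT address are placeholders.
import requests, json

flow = {'keys': 'suffix:[dns:ipsource]:.:3,suffix:[dns:ipdestination]:.:3',
        'value': 'bytes', 'activeTimeout': 10, 'log': True}
requests.put('http://localhost:8008/flow/dnsdomains/json',
             data=json.dumps(flow))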

DNS results are cached by the dns: function in order to provide real-time lookups and reduce the load on the backend name server(s). Cache size and timeout settings are tunable using System Properties.

Wednesday, August 17, 2016

Real-time web analytics

The diagram shows a typical scale out web service with a load balancer distributing requests among a pool of web servers. The sFlow HTTP Structures standard is supported by commercial load balancers, including F5 and A10, and open source load balancers and web servers, including HAProxy, NGINX, Apache, and Tomcat.
The simplest way to try out the examples in this article is to download sFlow-RT and install the Host sFlow agent and Apache mod-sflow instrumentation on a Linux web server.

The following sFlow-RT metrics report request rates based on the standard sFlow HTTP counters:
  • http_method_option
  • http_method_get
  • http_method_head
  • http_method_post
  • http_method_put
  • http_method_delete
  • http_method_trace
  • http_method_connect
  • http_method_other
  • http_status_1xx
  • http_status_2xx
  • http_status_3xx
  • http_status_4xx
  • http_status_5xx
  • http_status_other
  • http_requests
In addition, mod-sflow exports the following standard thread pool metrics:
  • workers_active
  • workers_idle
  • workers_max
  • workers_utilization
  • req_delayed
  • req_dropped
Cluster performance metrics describes how sFlow-RT's REST API is used to compute summary statistics for a pool of servers. For example, the following query calculates the cluster wide total request rates:
http://localhost:8008/metric/ALL/sum:http_method_get,sum:http_method_post/json
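A small Python script can poll this query and log the cluster wide rates, for example to feed a dashboard. The localhost address and 10 second polling interval below are placeholders:
#!/usr/bin/env python
# Sketch: poll cluster wide HTTP request rates from sFlow-RT.
# The sFlow-RT address and polling interval are placeholders.
import requests, time

url = 'http://localhost:8008/metric/ALL/sum:http_method_get,sum:http_method_post/json'
while True:
    for m in requests.get(url).json():
        print(m['metricName'], m['metricValue'])
    time.sleep(10)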
More interesting is that the sFlow telemetry stream also includes randomly sampled HTTP request records with the following attributes:
  • protocol
  • serveraddress
  • serveraddress6
  • serverport
  • clientaddress
  • clientaddress6
  • clientport
  • proxyprotocol
  • proxyserveraddress
  • proxyserveraddress6
  • proxyserverport
  • proxyclientaddress
  • proxyclientaddress6
  • proxyclientport
  • httpmethod
  • httpprotocol
  • httphost
  • httpuseragent
  • httpxff
  • httpauthuser
  • httpmimetype
  • httpurl
  • httpreferer
  • httpstatus
  • bytes
  • req_bytes
  • resp_bytes
  • duration
  • requests
The sFlow-RT analytics pipeline is programmable. Defining Flows describes how to compute additional metrics based on the sampled requests. For example, the following flow definition creates a new metric called image_bytes that tracks the volume of image data in HTTP responses as a bytes/second value calculated over a 10 second window:
setFlow('image_bytes', {value:'resp_bytes',t:10,filter:'httpmimetype~image/.*'});
The new metric can be queried in exactly the same way as the counter based metrics above, e.g.:
http://localhost:8008/metric/ALL/sum:image_bytes/json
The uri: function is used to extract parts of the httpurl or httpreferer URL fields. The following attributes can be extracted:
  • normalized
  • scheme
  • user
  • authority
  • host
  • port
  • path
  • file
  • extension
  • query
  • fragment
  • isabsolute
  • isopaque
For example, the following flow definition creates a metric called games_reqs that tracks the requests/second hitting URL paths with the prefix /games:
setFlow('games_reqs', {value:'requests',t:10,filter:'uri:httpurl:path~/games/.*'});
Define flow keys to identify slowest requests, most popular URLs, etc. For example, the following definition tracks the top 5 longest duration requests:
setFlow('slow_reqs', {keys:'httpurl',value:'duration',t:10,n:5});
The following query retrieves the result:
$ curl "http://localhost:8008/activeflows/ALL/slow_reqs/json?maxFlows=5"
[
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 117009.24305622398,
  "agent": "10.0.0.150",
  "key": "/login.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 7413.476263017302,
  "agent": "10.0.0.150",
  "key": "/games/animals.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 4486.286259806839,
  "agent": "10.0.0.150",
  "key": "/games/puzzles.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 2326.33482623333,
  "agent": "10.0.0.150",
  "key": "/sales/buy.php"
 },
 {
  "dataSource": "3.80",
  "flowN": 1,
  "value": 276.3486100676183,
  "agent": "10.0.0.150",
  "key": "/index.php"
 }
]
Sampled records are a useful complement to counter based metrics, making it possible to disaggregate counts and identify root causes. For example, suppose a spike in errors is identified through the http_status_4xx or http_status_5xx metrics. The following flow definition breaks out the most frequent failed requests by specific URL and error code:
setFlow('err_reqs', {keys:'httpurl,httpstatus',value:'requests',t:10,n:5,
  filter:'range:httpstatus:400=true'});
Finally, the real-time HTTP analytics don't exist in isolation. The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, delivers analytics through APIs, and can easily be integrated with a wide variety of on-site and cloud based orchestration, DevOps and Software Defined Networking (SDN) tools.

Friday, July 1, 2016

Real-time BGP route analytics

The diagram shows how sFlow-RT real-time analytics software can combine BGP route information and sFlow telemetry to generate route analytics. Merging sFlow traffic with BGP route data significantly enhances both data streams:
  1. sFlow real-time traffic data identifies active BGP routes
  2. BGP path attributes are available in flow definitions
The following example demonstrates how to configure sFlow / BGP route analytics. In this example, the switch IP address is 10.0.0.253, the router IP address is 10.0.0.254, and the sFlow-RT address is 10.0.0.162.

Setup

First download sFlow-RT. Next create a configuration file, bgp.js, in the sFlow-RT home directory with the following contents:
var reflectorIP  = '10.0.0.254';
var myAS         = '65162';
var myID         = '10.0.0.162';
var sFlowAgentIP = '10.0.0.253';

// allow BGP connection from reflectorIP
bgpAddNeighbor(reflectorIP,myAS,myID);

// direct sFlow from sFlowAgentIP to reflectorIP routing table
// calculate a 60 second moving average byte rate for each route
bgpAddSource(sFlowAgentIP,reflectorIP,60,'bytes');
The following sFlow-RT System Properties load the configuration file and enable BGP:
  • script.file=bgp.js
  • bgp.start=yes
Start sFlow-RT and the following log lines will confirm that BGP has been enabled and configured:
$ ./start.sh 
2016-06-28T13:14:34-0700 INFO: Listening, BGP port 1179
2016-06-28T13:14:35-0700 INFO: Listening, sFlow port 6343
2016-06-28T13:14:35-0700 INFO: Starting the Jetty [HTTP/1.1] server on port 8008
2016-06-28T13:14:35-0700 INFO: Starting com.sflow.rt.rest.SFlowApplication application
2016-06-28T13:14:35-0700 INFO: Listening, http://localhost:8008
2016-06-28T13:14:36-0700 INFO: bgp.js started
2016-06-28T13:14:36-0700 INFO: bgp.js stopped
Configure the switch (10.0.0.253) to send sFlow to the sFlow-RT instance (10.0.0.162); see Switch configurations for vendor specific configurations. Check the sFlow-RT /agents/html page to verify that sFlow telemetry is being received from the agent.

Next, configure the router (10.0.0.254) to reflect BGP routes to the sFlow-RT instance (10.0.0.162):
router bgp 65254
 bgp router-id 10.0.0.254
 neighbor 10.0.0.162 remote-as 65162
 neighbor 10.0.0.162 port 1179
 neighbor 10.0.0.162 timers connect 30
 neighbor 10.0.0.162 route-reflector-client
 neighbor 10.0.0.162 activate
The following sFlow-RT log entry confirms that a BGP session has been established:
2016-06-28T13:20:17-0700 INFO: BGP open 10.0.0.254 53975

Query active routes

The following cURL command uses the REST API to identify the top 5 IPv4 prefixes ranked by traffic (measured in bytes/second):
curl "http://10.0.0.162:8008/bgp/topprefixes/10.0.0.254/json?maxPrefixes=5
{
 "as": 65254,
 "direction": "destination",
 "id": "10.0.0.254",
 "learnedPrefixesAdded": 691838,
 "learnedPrefixesRemoved": 0,
 "nPrefixes": 691838,
 "pushedPrefixesAdded": 0,
 "pushedPrefixesRemoved": 0,
 "startTime": 1467322582093,
 "state": "established",
 "topPrefixes": [
  {
   "aspath": "NNNN-NNNN-NNNNN-NNNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NN.NNN.NN.0/24",
   "value": 9.735462342126082E7
  },
  {
   "aspath": "NNN-NNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NN.NNN.NNN.0/24",
   "value": 7.347515546153101E7
  },
  {
   "aspath": "NNNN-NNNNNN-NNNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NN.NNN.NN.N/24",
   "value": 4.26137765317916E7
  },
  {
   "aspath": "NNNN-NNNN-NNNN",
   "localpref": 100,
   "med": 1,
   "nexthop": "NNN.NNN.NNN.N",
   "origin": "IGP",
   "prefix": "NNN.NN.NNN.0/24",
   "value": 2.6633190792947102E7
  },
  {
   "aspath": "NNNN-NNN-NNNNN",
   "localpref": 100,
   "med": 10001,
   "nexthop": "NNN.NNN.NNN.NN",
   "origin": "IGP",
   "prefix": "NN.NNN.NNN.0/24",
   "value": 1.5500941476103483E7
  }
 ],
 "valuePercentCoverage": 71.38452058755995,
 "valueTopPrefixes": 2.55577687683634E8,
 "valueTotal": 3.5802956380458355E8
}
In addition to returning the top prefixes, the query returns information about the amount of traffic covered by these prefixes. In this case, the valuePercentCoverage of 71.38 indicates that 71.38% of the traffic is covered by the top 5 prefixes.
Note: Identifying numeric digits have been substituted with the letter N to protect privacy.
Additional arguments can be used to refine the top prefixes query:
  • maxPrefixes, maximum number of prefixes in the result 
  • minValue, only include entries with a value greater than the threshold
  • direction, specify "ingress" for traffic arriving from remote networks and "egress" for traffic destined for remote networks
  • minPrefix, exclude shorter prefixes, e.g. minPrefix=1 would exclude 0.0.0.0/0.
  • includeCovered, set to "true" to also include prefixes that are covered by the top prefix, but wouldn't otherwise make the list. For example, if 10.1.0.0/16 was included, then 10.1.3.0/24 would also be included if it were in the set of prefixes advertised by the router.
  • pruneCovered, set to "true" to eliminate covered prefixes that share the same next hop.
IPv6 prefixes can be queried using /bgp/topprefixes6/{router}/json, which takes the same arguments as the topprefixes query shown above.
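These arguments can be combined. For example, the following Python sketch (equivalent to the cURL query above, with several of the arguments applied) retrieves the top 10 egress prefixes, excludes the default route, and ignores prefixes carrying less than 1Mbyte/s:
#!/usr/bin/env python
# Sketch: refine the top prefixes query using the arguments listed above.
import requests

url = 'http://10.0.0.162:8008/bgp/topprefixes/10.0.0.254/json'
params = {'maxPrefixes': 10, 'direction': 'egress',
          'minPrefix': 1, 'minValue': 1000000}
top = requests.get(url, params=params).json()
for p in top['topPrefixes']:
    print(p['prefix'], p['aspath'], p['value'])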

Writing Applications describes how to build analytics driven controller applications using sFlow-RT's REST and embedded JavaScript APIs. For example, SDN router using merchant silicon top of rack switch, White box Internet router PoC, and Active Route Manager demonstrate how real-time identification of active routes can be used to efficiently manage limited hardware resources in commodity white box switches in order to handle a full Internet routing table of over 600,000 routes.

Defining Flows

The following flow attributes learned from the BGP session are merged with sFlow data received from switch 10.0.0.253:
  • ipsourcemaskbits
  • ipdestinationmaskbits
  • bgpnexthop
  • bgpnexthop6
  • bgpas
  • bgpsourceas
  • bgpsourcepeeras
  • bgpdestinationas
  • bgpdestinationpeeras
  • bgpdestinationaspath
  • bgpcommunities
  • bgplocalpref
The sFlow-RT /flowkeys/html page can be queried to verify that the attributes have been merged and to see the full set of attributes that are available from the sFlow feed.
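A quick way to script this check is to fetch the key list and look for the BGP attributes. The sketch below assumes a /flowkeys/json counterpart to the /flowkeys/html page mentioned above:
#!/usr/bin/env python
# Sketch: verify that BGP attributes appear in the available flow keys.
# Assumes a /flowkeys/json counterpart to the /flowkeys/html page.
import requests

keys = requests.get('http://10.0.0.162:8008/flowkeys/json').text
for k in ['bgpnexthop', 'bgpsourceas', 'bgpdestinationaspath']:
    print(k, k in keys)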

Writing Applications describes how to program sFlow-RT flow caches, using the flow keys to select and identify traffic flows. For example, the following Python script uses the REST API to identify the source networks associated with a UDP amplification DDoS attack:
#!/usr/bin/env python
import requests
import json

# DNS port
reflector_port = '53'
max_pps = 100000

rest = 'http://localhost:8008'

# define flow
flow = {'keys':'mask:ipsource,bgpsourceas',
 'filter':'udpsourceport='+reflector_port,
 'value':'frames'}
requests.put(rest+'/flow/ddos/json',data=json.dumps(flow))

# set threshold
threshold = {'metric':'ddos', 'value': max_pps, 'byFlow':True}
requests.put(rest+'/threshold/ddos/json',data=json.dumps(threshold))

# tail event log
eventurl = rest+'/events/json?thresholdID=ddos&maxEvents=10&timeout=60'
eventID = -1
while 1 == 1:
  r = requests.get(eventurl + "&eventID=" + str(eventID))
  if r.status_code != 200: break
  events = r.json()
  if len(events) == 0: continue

  eventID = events[0]["eventID"]
  events.reverse()
  for e in events:
    print(e['flowKey'])
Running the script generates a log of the source network and AS number that exceed 100,000 packets per second of DNS response traffic (again, identifying numeric digits have been substituted with the letter N to protect privacy):
$ ./ddos.py 
NNN.NNN.0.0/13,NNNN
NNN.NNN.NNN.NNN/27,NNNN
NNN.NN.NNN.NNN/28,NNNNN
NNN.NNN.NN.0/24,NNNNN
A variation on the script can be used to identify large "Elephant" flows and their destination AS paths (showing the list of networks that packets traverse en route to their destination):
#!/usr/bin/env python
import requests
import json

max_Bps = 1000000000/8

rest = 'http://localhost:8009'

# define flow
flow = {
 'keys':'ipsource,ipdestination,tcpsourceport,tcpdestinationport,bgpdestinationaspath',
 'value':'bytes'}
requests.put(rest+'/flow/elephant/json',data=json.dumps(flow))

# set threshold
threshold = {'metric':'elephant', 'value': max_Bps, 'byFlow':True}
requests.put(rest+'/threshold/elephant/json',data=json.dumps(threshold))

# tail event log
eventurl = rest+'/events/json?thresholdID=elephant&maxEvents=10&timeout=60'
eventID = -1
while 1 == 1:
  r = requests.get(eventurl + "&eventID=" + str(eventID))
  if r.status_code != 200: break
  events = r.json()
  if len(events) == 0: continue

  eventID = events[0]["eventID"]
  events.reverse()
  for e in events:
    print(e['flowKey'])
Running the script generates real-time notification of the Elephant flows (flows exceeding 1Gbit/s) along with their destination AS paths:
$ ./elephant.py 
NNN.NN.NN.NNN,NNN.NNN.NN.NN,60789,25,NNNNN
NNN.NN.NNN.NN,NNN.NN.NN.NNN,443,38016,NNNNN-NNNNN-NNNNN-NNNNN
NN.NNN.NNN.NNN,NNN.NNN.NN.NN,37030,10059,NNNN-NNN-NNNN
NNN.NN.NN.NNN,NN.NN.NNN.NNN,34611,25,NNNN
SDN and large flows describes how a small number of Elephant flows typically consume most of the bandwidth, even though they are greatly outnumbered by small (Mice) flows. Dynamic policy based routing can be targeted at Elephant flows to significantly improve performance and manage network resources: Leaf and spine traffic engineering using segment routing and SDN and WAN optimization using real-time traffic analytics are two examples.
Finally, the real-time BGP analytics don't exist in isolation. The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, delivers analytics through APIs, and can easily be integrated with a wide variety of on-site and cloud based orchestration, DevOps and Software Defined Networking (SDN) tools.

Thursday, June 16, 2016

Cisco Tetration analytics

Cisco Tetration Analytics: the most Comprehensive Data Center Visibility and Analysis in Real Time, at Scale, June 15, 2016, announced the new Cisco Tetration Analytics platform. The platform collects telemetry from proprietary agents on servers and embedded in hardware on certain Nexus 9k switches, analyzes the data, and presents results via Web GUI, REST API, and as events.

Cisco Tetration Analytics Data Sheet describes the hardware requirements:
  • Cisco Tetration Analytics computing nodes (servers): 16
  • Cisco Tetration Analytics base nodes (servers): 12
  • Cisco Tetration Analytics serving nodes (servers): 8
  • Cisco Nexus 9372PX Switches: 3

And the power requirements:
  • Peak power, 39-RU single-rack option: 22.5 kW
  • Peak power, 39-RU dual-rack option: 11.25 kW per rack (22.5 kW total)

No pricing is given, but based on the hardware, data center space, power and cooling requirements, this brute force approach to analytics will be reassuringly expensive to purchase and operate.

Update June 22, 2016: See 451 Research report, Cisco Tetration: a $3m, 1,700-pound appliance for network traffic analytics is born, for pricing information.
A much less expensive alternative is to use industry standard sFlow agents embedded in Cisco Nexus 9k/3k switches and in switches from over 40 other vendors. The open source Host sFlow agent extends visibility to servers and applications by streaming telemetry from Linux, Windows, FreeBSD, Solaris, and AIX operating systems, hypervisors, Docker containers, web servers (Apache, NGINX, Tomcat, HAProxy) and Java application servers.

The diagram shows how the sFlow-RT real-time analytics engine receives a continuous telemetry stream from sFlow instrumentation built into network, server and application infrastructure, delivers analytics through APIs, and can easily be integrated with a wide variety of on-site and cloud based orchestration, DevOps and Software Defined Networking (SDN) tools.

Minimizing cost of visibility describes why lightweight monitoring is critical to realizing the value that telemetry can bring to improving operational efficiency. In the case of the sFlow based solution, the critical data path instrumentation is built into the switch ASICs and in the Linux kernel, ensuring that there is negligible impact on operational performance.

The sFlow-RT analytics software shown in the diagram provides real-time (sub second) visibility for 5,000 unique end points (virtual machines or bare metal servers), the upper limit of scaleability in the Tetration data sheet, using a single virtual machine or Docker container with 4 GBytes of RAM and 4 CPU cores. With additional memory and CPU the solution easily scales to 100,000 unique end points.
How can sFlow provide real-time visibility at scale and consume so few resources? Shrink ray describes how advanced statistical techniques are used to select and analyze measurements that capture the essential features of network and system performance. A statistical approach yields fast, accurate answers, while minimizing the resources required to measure, transport and analyze the data.
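The scaling step at the heart of packet sampling is simple: each sampled packet represents, on average, one packet in every sampling_rate packets on the wire, so traffic rates are estimated by scaling up the samples. The following sketch uses made-up numbers for illustration:
# Estimate traffic rates from sampled packets. Each sample represents
# (on average) sampling_rate packets on the wire. Numbers are illustrative.
sampling_rate = 10000        # 1-in-10000 packet sampling
interval_seconds = 10.0

sampled_packet_sizes = [1500, 64, 9000, 1500, 64]  # bytes, sampled in interval

est_packets_per_sec = len(sampled_packet_sizes) * sampling_rate / interval_seconds
est_bytes_per_sec = sum(sampled_packet_sizes) * sampling_rate / interval_seconds
print(est_packets_per_sec, est_bytes_per_sec)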
The sFlow-RT analytics platform was selected as an example because of the overlap in capabilities with the Cisco Tetration analytics platform. However, sFlow is non-proprietary and there are many other open source and commercial sFlow analytics solutions listed on sFlow.org.

The Cisco press release states, "Available in July 2016, the first Tetration platform will be a full rack appliance that is deployed on-premise at the customer’s data center." On the other hand, the sFlow based solution described here is available today and can be installed and running in minutes on a virtual machine or in a Docker container.

Monday, February 1, 2016

SignalFx

SignalFx is an example of a cloud based analytics service. SignalFx provides a REST API for uploading metrics and a web portal that makes it simple to combine and trend data and to build and share dashboards.

This article describes a proof of concept demonstrating how SignalFx's cloud service can be used to cost effectively monitor large scale cloud infrastructure by leveraging standard sFlow instrumentation. SignalFx offers a free 14 day trial, making it easy to evaluate solutions based on this demonstration.

The diagram shows the measurement pipeline. Standard sFlow measurements from hosts, hypervisors, virtual machines, containers, load balancers, web servers and network switches stream to the sFlow-RT real-time analytics engine. Metrics are pushed from sFlow-RT to SignalFx using the REST API.

Over 40 vendors implement the sFlow standard and compatible products are listed on sFlow.org. The open source Host sFlow agent exports standard sFlow metrics from hosts, virtual machines and containers and local services. For additional background, the Velocity conference talk provides an introduction to sFlow and case study from a large social networking site.

SignalFx's service is priced based on the number of data points that they need to store and they estimate a cost of $15 per host per month to record comprehensive host statistics at 10 second granularity. Collecting metrics from a cluster of 1,000 hosts would cost as much as $15,000 per month.
There are important scaleability and cost advantages to placing the sFlow-RT analytics engine in front of the metrics collection service. For example, in large scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending since virtual machines are frequently added and removed. Instead, sFlow-RT tracks all the members of the pool, calculates summary statistics for the pool, and logs the summary statistics. This pre-processing can significantly reduce storage requirements, reducing costs and increasing query performance. The sFlow-RT analytics software also calculates traffic flow metrics, hot/missed Memcache keys, and top URLs, exports events via syslog to Splunk, Logstash etc., and provides access to detailed metrics through its REST API.
The following steps were involved in setting up the proof of concept.

First register for free trial at SignalFx.com.

Download and install sFlow-RT.

Create a signalfx.js script in the sFlow-RT home directory with the following lines (use the token from your SignalFx account):
var url = "https://ingest.signalfx.com/v2/datapoint";
var token = "YOUR_APP_API_TOKEN";

setIntervalHandler(function() {
  var metrics = ['min:load_one','q1:load_one','med:load_one',
                 'q3:load_one','max:load_one'];
  var vals = metric('ALL',metrics,{os_name:['linux']});
  var gauges = [];
  for each (var val in vals) {
     gauges.push({
       metric: val.metricName.replace(/[^a-zA-Z0-9_]/g,"_"),
       dimensions:{cluster:"Linux"},
       value: val.metricValue
     });
  }
  var body = {"gauge":gauges};
  var req = {
    url:url,
    operation:'post',
    headers: {
      'Content-Type':'application/json',
      'X-SF-TOKEN':token
    },
    body: JSON.stringify(body)
  };
  try { http2(req); }
  catch(e) { logWarning("metric upload failed " + e); }
} , 10); 
Add the following sFlow-RT configuration entry to load the script:
script.file=signalfx.js
Now start sFlow-RT. Cluster performance metrics describes the summary metrics that sFlow-RT can calculate. In this case, the load average minimum, maximum, and quartiles for the cluster are calculated and pushed to SignalFx every 10 seconds.

Install Host sFlow agents on the physical or virtual machines in your cluster and direct them to send metrics to the sFlow-RT host. The installation steps can be easily automated using orchestration tools like Puppet, Chef, Ansible, etc.

Physical and virtual switches in the cluster can be configured to send sFlow to sFlow-RT in order to add traffic metrics to the mix, exporting metrics that characterize traffic between service tiers, etc. However, in public cloud environments, traffic flow information is typically not available. The articles Amazon Elastic Compute Cloud (EC2) and Rackspace cloudservers describe how Host sFlow agents can be configured to monitor traffic between virtual machines in the cloud.
Metrics should start appearing in SignalFx as soon as the Host sFlow agents are started.

In this example, sFlow-RT is exporting 5 metrics to summarize the cluster performance, reducing the total cost of monitoring the 1,000 host cluster to less than $15 per month. Of course there are likely to be more metrics that you will want to track, but the ability to selectively log high value metrics provides a way to control costs and maximize benefits.

If you are managing physical infrastructure then sFlow provides a simple way to incorporate network telemetry. For example, add the following metrics to the script to summarize network health:
  • max:ifinutilization
  • max:ifoututilization
  • sum:ifindiscards
  • sum:ifinerrors
  • sum:ifoutdiscards
  • sum:ifouterrors
A network connecting 1,000 physical hosts would have considerably more than 1,000 switch ports and summarizing the per port statistics greatly reduces the cost of monitoring the network. For a catalog of network, host, and application metrics, see Metrics.
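The same summary metrics can also be pulled directly from the sFlow-RT REST API, for example to spot check network health before adding them to the export script. The localhost address below is a placeholder for the sFlow-RT host:
#!/usr/bin/env python
# Sketch: query network-wide summaries of the interface health metrics above.
import requests

metrics = ('max:ifinutilization,max:ifoututilization,'
           'sum:ifindiscards,sum:ifinerrors,sum:ifoutdiscards,sum:ifouterrors')
for m in requests.get('http://localhost:8008/metric/ALL/' + metrics + '/json').json():
    print(m['metricName'], m['metricValue'])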

Friday, November 13, 2015

SC15 live real-time weathermap

Connect to http://inmon.sc15.org/sflow-rt/app/sc15-weather/html/ between now and November 19th to see a real-time heat map of The International Conference for High Performance Computing, Networking, Storage and Analysis (SC15) network.

From the SCinet web page, "SCinet brings to life a very high-capacity network that supports the revolutionary applications and experiments that are a hallmark of the SC conference. SCinet will link the convention center to research and commercial networks around the world. In doing so, SCinet serves as the platform for exhibitors to demonstrate the advanced computing resources of their home institutions and elsewhere by supporting a wide variety of bandwidth-driven applications including supercomputing and cloud computing."

The real-time weathermap leverages industry standard sFlow instrumentation built into network switch and router hardware to provide scaleable monitoring of the more than 6 Terabit/s of aggregate link capacity that makes up the SCinet network. Link colors are updated every second to reflect operational status and utilization of each link.

Clicking on a link in the map pops up a 1 second resolution strip chart showing the protocol mix carried by the link.

The SCinet real-time weathermap was constructed using open source components running on the sFlow-RT real-time analytics engine. Download sFlow-RT and see what you can build.

Update December 1, 2015 The source code is now available on GitHub

Monday, September 28, 2015

Real-time analytics and control applications

sFlow-RT 2.0 released - adds application support describes a new application framework for sharing solutions built on top of the real-time analytics platform. Application examples are provided on the sFlow-RT Download page.

The flow-graph application, shown above, generates a real-time graph of communication between hosts.  The application uses a simple sFlow-RT script to track associations between hosts based on their communication patterns and plots the results using the vis.js dynamic, browser based visualization library. This example can be modified to track different types of relationship and extended to incorporate other popular data visualization libraries such as D3.js.
The dashboard-example includes representative real-time metric and top flows trend charts. The example uses the jQuery-UI library to build a simple tabbed interface. This example can be extended to build groups of custom charts.
The top-flows application supports the definition of custom flows and tracks the largest flows in a continuously updating table.

Each of the examples has a server-side component that uses sFlow-RT's script API to collect, analyze, and export measurements. An HTML5 client side user interface connects to the server and presents the data.

The sFlow-RT analytics engine is a highly scaleable platform for processing sFlow measurements from physical and virtual network switches, servers, virtual machines, Linux containers, load balancers, web and application servers, etc. The analytics capability can be applied to a wide range of SDN and DevOps use cases - many of which have been described on this blog. Application support provides a simple way for vendors, researchers, and developers to distribute solutions.

Thursday, February 5, 2015

Cloud analytics

Librato is an example of a cloud based analytics service (now part of SolarWinds). Librato provides an easy to use REST API for pushing metrics into their cloud service. The web portal makes it simple to combine and trend data and build and share dashboards.

This article describes a proof of concept demonstrating how Librato's cloud service can be used to cost effectively monitor large scale cloud infrastructure by leveraging standard sFlow instrumentation. Librato offers a free 30 day trial, making it easy to evaluate solutions based on this demonstration.
The diagram shows the measurement pipeline. Standard sFlow measurements from hosts, hypervisors, virtual machines, containers, load balancers, web servers and network switches stream to the sFlow-RT real-time analytics engine. Metrics are pushed from sFlow-RT to Librato using the REST API.

Over 40 vendors implement the sFlow standard and compatible products are listed on sFlow.org. The open source Host sFlow agent exports standard sFlow metrics from hosts. For additional background, the Velocity conference talk provides an introduction to sFlow and case study from a large social networking site.


Librato's service is priced based on the number of data points that they need to store. For example, a Host sFlow agent reports approximately 50 measurements per node. Collecting all the measurements from a cluster of 100 servers would generate 5000 metrics and cost $1,000 per month if metrics are stored at 15 second intervals.
There are important scaleability and cost advantages to placing the sFlow-RT analytics engine in front of the metrics collection service. For example, in large scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending since virtual machines are frequently added and removed. Instead, sFlow-RT tracks all the members of the pool, calculates summary statistics for the pool, and logs the summary statistics. This pre-processing can significantly reduce storage requirements, reducing costs and increasing query performance. The sFlow-RT analytics software also calculates traffic flow metrics, hot/missed Memcache keys, and top URLs, exports events via syslog to Splunk, Logstash etc., and provides access to detailed metrics through its REST API.
The following steps were involved in setting up the proof of concept.

First register for free trial at Librato.com.

Find or build a server with Java 1.7+ and install sFlow-RT:
wget http://www.inmon.com/products/sFlow-RT/sflow-rt.tar.gz
tar -xvzf sflow-rt.tar.gz
cd sflow-rt
Edit the init.js script and add the following lines (modifying the user and token from your Librato account):
var url = "https://metrics-api.librato.com/v1/metrics";
var user = "first.last@mycompany.com";
var token = "55add91c806fb5f634ad1a334789a32e8d10a597815e6865aa84f0749324450e";

setIntervalHandler(function() {
  var metrics = ['min:load_one','q1:load_one','med:load_one',
                 'q3:load_one','max:load_one'];
  var vals = metric('ALL',metrics,{os_name:['linux']});
  var gauges = {};
  for each (var val in vals) {
     gauges[val.metricName] = {
       "value": val.metricValue,
       "source": "Linux_Pool"
     };
  }
  var body = {"gauges":gauges};
  http(url,'post', 'application/json', JSON.stringify(body), user, token);
} , 15); 
Now start sFlow-RT:
./start.sh
Cluster performance metrics describes the summary metrics that sFlow-RT can calculate. In this case, the load average minimum, maximum, and quartiles for the cluster are being calculated and pushed to Librato every 15 seconds.

Install Host sFlow agents on the physical or virtual machines in your cluster and direct them to send metrics to the sFlow-RT host. The installation steps can be easily automated using orchestration tools like Puppet, Chef, Ansible, etc.

Physical and virtual switches in the cluster can be configured to send sFlow to sFlow-RT in order to add traffic metrics to the mix, exporting metrics that characterize traffic between service tiers, etc. However, in public cloud environments, traffic flow information is typically not available. The articles Amazon Elastic Compute Cloud (EC2) and Rackspace cloudservers describe how Host sFlow agents can be configured to monitor traffic between virtual machines in the cloud.
Metrics should start appearing in Librato as soon as the Host sFlow agents are started.

In this example, sFlow-RT is exporting 5 metrics to summarize the cluster performance, reducing the total monthly cost of monitoring the cluster from $1,000 to $1. Of course there are likely to be more metrics that you will want to track, but the ability to selectively log high value metrics provides a way to control costs and maximize benefits.

Tuesday, December 9, 2014

InfluxDB and Grafana

Cluster performance metrics describes how to use sFlow-RT to calculate metrics and post them to Graphite. This article will describe how to use sFlow with the InfluxDB time series database and Grafana dashboard builder.

The diagram shows the measurement pipeline. Standard sFlow measurements from hosts, hypervisors, virtual machines, containers, load balancers, web servers and network switches stream to the sFlow-RT real-time analytics engine. Over 40 vendors implement the sFlow standard and compatible products are listed on sFlow.org. The open source Host sFlow agent exports standard sFlow metrics from hosts. For additional background, the Velocity conference talk provides an introduction to sFlow and case study from a large social networking site.
It is possible to simply convert the raw sFlow metrics into InfluxDB metrics. The sflow2graphite.pl script provides an example that can be modified to support InfluxDB's native format, or used unmodified with the InfluxDB Graphite input plugin. However, there are scaleability advantages to placing the sFlow-RT analytics engine in front of the time series database. For example, in large scale cloud environments the metrics for each member of a dynamic pool aren't necessarily worth trending since virtual machines are frequently added and removed. Instead, sFlow-RT tracks all the members of the pool, calculates summary statistics for the pool, and logs the summary statistics to the time series database. This pre-processing can significantly reduce storage requirements, reducing costs and increasing query performance. The sFlow-RT analytics software also calculates traffic flow metrics, hot/missed Memcache keys, and top URLs, exports events via syslog to Splunk, Logstash etc., and provides access to detailed metrics through its REST API.
First install InfluxDB - in this case the software has been installed on host 10.0.0.30.

Next install sFlow-RT:
wget http://www.inmon.com/products/sFlow-RT/sflow-rt.tar.gz
tar -xvzf sflow-rt.tar.gz
cd sflow-rt
Edit the init.js script and add the following lines (modifying the dbURL to send metrics to the InfluxDB instance):
var dbURL = "http://10.0.0.30:8086/db/inmon/series?u=root&p=root";

setIntervalHandler(function() {
  var metrics = ['min:load_one','q1:load_one','med:load_one',
                 'q3:load_one','max:load_one'];
  var vals = metric('ALL',metrics,{os_name:['linux']});
  var body = [];
  for each (var val in vals) {
     body.push({name:val.metricName,columns:['val'],points:[[val.metricValue]]});
  }
  http(dbURL,'post', 'application/json', JSON.stringify(body));
} , 15);
Now start sFlow-RT:
./start.sh
The script makes an sFlow-RT metrics() query every 15 seconds and posts the results to InfluxDB.
The screen capture shows InfluxDB's SQL like query language and a basic query demonstrating that the metrics are being logged in the database. However, the web interface is rudimentary and a dashboard builder simplifies querying and presentation of the time series data.

Grafana is a powerful HTML 5 dashboard building tool that supports InfluxDB, Graphite, and OpenTSDB.
The screen shot shows the Grafana query builder, offering simple drop down menus that make it easy to build complex charts. The resulting chart, shown below, can be combined with additional charts to build a custom dashboard.
The sFlow standard delivers comprehensive instrumentation of data center infrastructure and is easily integrated with DevOps tools - see Visibility and the software defined data center.

Update January 31, 2016:

The InfluxDB REST API changed with version 0.9 and the above sFlow-RT script will no longer work. The new API is described in Creating a database using the HTTP API. The following version of the script has been updated to use the new API:
var dbURL = "http://10.0.0.30:8086/write?db=mydb";

setIntervalHandler(function() {
  var metrics = ['min:load_one','q1:load_one','med:load_one',
                 'q3:load_one','max:load_one'];
  var vals = metric('ALL',metrics,{os_name:['linux']});
  var body = [];
  for each (var val in vals) {
     body.push(val.metricName.replace(/[^a-zA-Z0-9_]/g,'_') + ' value=' + val.metricValue);
  }
  try { http(dbURL,'post', 'text/plain', body.join('\n')); }
  catch(e) { logWarning('http error ' + e); }
} , 15);
Update April 27, 2016

The sFlow-RT software no longer ships with an init.js file.

Instead, create an influxdb.js file in the sFlow-RT home directory and add the JavaScript code. Next, edit the start.sh file to add a script.file=influxdb.js option, i.e.
RT_OPTS="-Dscript.file=influxdb.js -Dsflow.port=6343 -Dhttp.port=8008"
The script should be loaded when sFlow-RT is started.