I love IT. Here's where I write about it. Mainly about packet communications, but some purely philosophical posts as well.
Wednesday, December 19, 2012
20121219 Why Weialgo? Essay #2
Another email about my newly bad home Internet connection. Having network problems is fun, for a short while. I always have my 4G phone if I -really- need something, so this isn't that big of a deal at the moment.
subject: the ping graphs for this morning
Rob Luce <moo@moo.com>
5:05 AM (2 minutes ago)
to Todd
Not able to sleep as usual, so here's some graphs. Yes, they're huge, but, just roll with it for a minute.
It's the same information in both: weialgo at 2 second intervals, pingplotter at 100ms intervals. What I'm trying to show is that the views are very similar; it's just a matter of presentation.
Weialgo tries to give a visual representation of the impact of variation. Pingplotter displays simple absolute latency.
Pingplotter will show a 300ms rtt latency as 300ms. Weialgo will show 300ms rtt latency, if that's the BEST latency for the site, as 0ms. That's important because at a certain level, we can't do anything about the size of the earth. If the network between Chicago and Singapore has zero congestion, and the best we'll ever get going from here to there and back is 300ms, then that's what it is. A network should start with realistic expectations, and one of those is that photons and electrons have speed limits. So, minimum latency is where weialgo starts, and it shows it as 0ms.
On that 300ms base, 50ms jitter will show up at 350ms on pingplotter; it'll show up as a green 50ms spike on weialgo.
100ms jitter will show up at 400ms on pingplotter; it'll show up as a yellow 400ms spike on weialgo. Above 50ms jitter, weialgo switches from relative to absolute, to demonstrate the overall effect. Sloppy packet handling (jitter) will have a bigger impact on a site that's far away than on a site that's close by. Weialgo visualizes that, as it was intended to.
200ms jitter will show up at 500ms on pingplotter, it'll show up as a red 500ms spike on weialgo. Anything above 450ms absolute rtt on weialgo goes red, as when you reach the half second mark, human beings start perceiving the latency in applications.
And a dropped packet is a red strike in both of them.
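The rules above can be sketched in a few lines of Python. This is only my reading of the description, not the actual weialgo scripts: the function names and the constants are made up for illustration, and I'm assuming the 450ms red check only applies once the graph has switched to absolute (a site whose best rtt is already over 450ms isn't covered by the rules as written).

```python
from typing import List, Optional, Tuple

JITTER_SWITCH_MS = 50   # above this much jitter, plot absolute rtt instead of jitter
RED_RTT_MS = 450        # above this absolute rtt, the spike goes red

def weialgo_point(rtt_ms: Optional[float], best_rtt_ms: float) -> Tuple[Optional[float], str]:
    """Map one ping sample to (plotted value, color). None means a dropped packet."""
    if rtt_ms is None:
        return (None, "red")            # dropped packet: red strike
    jitter = rtt_ms - best_rtt_ms       # variation above the best observed rtt
    if jitter <= JITTER_SWITCH_MS:
        return (jitter, "green")        # relative scale: best rtt plots as 0ms
    if rtt_ms > RED_RTT_MS:
        return (rtt_ms, "red")          # absolute scale, past the half-second zone
    return (rtt_ms, "yellow")           # absolute scale, not yet red

def weialgo_series(samples: List[Optional[float]]) -> List[Tuple[Optional[float], str]]:
    """Plot a whole series, using the minimum observed rtt as the baseline."""
    best = min(s for s in samples if s is not None)
    return [weialgo_point(s, best) for s in samples]

# The examples from the email: 300ms best rtt, then 50/100/200ms jitter, then a drop.
for point in weialgo_series([300, 350, 400, 500, None]):
    print(point)
```

Pingplotter would simply plot 300, 350, 400 and 500 for the same samples; the only difference is the mapping.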
Looking at the graphs this morning, I wanted to remind/reinforce what I've already explained before. It's useful, and I think you'll need it.
Weialgo is a modeled graph, pingplotter is a simple absolute graph. Pingplotter is probably more useful to engineers, but weialgo is more useful for representing the network to the lay person. Pingplotter has to be translated for people; weialgo attempts to explain why people may be complaining, based on what their expectations should be.
For those of you that know Todd, you can now feel very sorry for him, as this is the type of stuff he's had to deal with for almost 20 years now. 24x7 non-stop engineering-ish usually-IT stuff. IT is my job, my hobby, and my life. I imagine it can be a little rough to be around me for people who don't take their contribution to the world seriously. But, I suppose normal human beings like to do things that are easy and soft and safe. Golf, or bowling, or maybe watching football, or stuff like that I suppose. Todd is exceptionally gifted, but I guess he has a streak of normal in him.
I've learned to live with the fact that normal people aren't driven to try to do as much as they can. Yes, I know that will sound funny if you know me. But, think about it for a bit. I don't do easy or soft or safe. There was a time where I couldn't be stopped, so I think God stopped me for me, although I've never thanked him for it.
Anyways...
Here's why weialgo. Weialgo tries to show the network to people, or why the network seems to change moment by moment throughout the day. If the packet handling is sloppy, jitter is going to be all over the place. But base latency isn't something we can control, the distance between point A and B will always be fixed, and there will be incurred latency due to that overall round trip. So, an absolute graph of latency isn't really appropriate for showing the quality of the network between A and B. Starting the graph at the minimum observed latency is how it should be shown. I can't do anything about the fact that the end user is in South Africa, and the server is in Canada. What I can do is make sure that the expectation for the network response time of that server doesn't change. That's the key to all of this.
Now, there has to be a realization that if we are sloppy in handling the packets, overall round trip time (rtt) will bite us. Think of jitter as a dog that bites, and the base rtt as the dog's teeth. Some networks are going to have long base rtt, so you need to pay much better attention to how you handle the packets. If a dog is toothless, ehhh... who cares if he gums you a little bit. But, if that dog has REALLY big, razor sharp teeth (Server in New York, the user is in India) then you REALLY need to pay attention or you'll end up without a leg. Maybe a bit too visual of an analogy, but do you get it now?
In networking, it's popular to think that you only really need bandwidth graphs to be able to understand how to provision your network. That's just plain false. Knowing bandwidth utilization is important (95th percentile, 70%+ utilization over time with the expectation of that continuing into the future, upgrade the site), but by itself it won't tell you what people are observing from the network at the site. You need bandwidth graphs and ping (rtt) graphs if you want to know how useful the network is to the people using it.
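For what it's worth, the bandwidth rule of thumb in the parentheses can be sketched like this. It's one reading of it, assuming utilization samples expressed as a percentage of link capacity and a nearest-rank percentile; the function names are made up, and the "expectation of that continuing into the future" part is still a judgment call that no script makes for you.

```python
from typing import Sequence

def percentile_95(samples: Sequence[float]) -> float:
    """Nearest-rank 95th percentile of a set of utilization samples."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # 1-based rank = ceil(0.95 * n), done in integer math to avoid float rounding
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]

def needs_upgrade(utilization_pct: Sequence[float], threshold_pct: float = 70.0) -> bool:
    """True when the 95th percentile utilization is at or above the threshold."""
    return percentile_95(utilization_pct) >= threshold_pct

# A link that is mostly idle but busy 10% of the time trips the rule.
print(needs_upgrade([10.0] * 90 + [80.0] * 10))
```

Note what this does not tell you: nothing in those samples says whether the people at the site are seeing jitter.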
Normal ping graphs are absolute and need to be translated to non-network engineers. Weialgo is a relative scale model and is designed to not need translation. Anyone should be able to look at a weialgo graph and say "this looks good", or "this looks bad". Because, it's the variation that matters at a point, not the absolute round trip time.
Well, rtt *does* matter, but I live in the real world. Base rtt is a fact of life. No one has invented a faster photon/electron, and moving everyone on the planet onto a small island so that base rtt never exceeds 10ms is a terrible idea. So, distance is the rule, not the exception, and must be factored into everything: IT, app design, network management, and setting end user expectations.
But, because base rtt combined with jitter has a multiplicative effect on the performance of an interactive application, its impact needs to be represented in the graph. Absolute rtt graphs do not do that, and non-network engineers don't understand this, nor do they get the idea when looking at an absolute rtt graph.
This point deserves some emphasis. Take 3 identical server/client systems, one with 0ms base latency and 300ms jitter, one with 150ms base latency and 150ms jitter, and one with 300ms base latency and 0ms jitter. The 150/150 system will have the worst interactive performance. I should detail this in a separate blog post some other time, but I'll stand on this statement at this time. I have a lot of experience with this particular issue of interactive applications over Internet connectivity. Jitter by itself is bad, latency by itself is bad, but combine the two and people are out to shoot you because their Citrix sessions won't stay connected or because you wiped the RAID.
Personally, I like packeteers in dq mode for doing this type of testing, but Linux has a similar latency simulation capability (since I always have a couple packeteers around, I have only used the Linux method once, you'll have to google it). Prove this to yourself. Take your favorite interactive app and start messing with different combinations of base latency + jitter = some set amount. The reasoning for all of this has to do with how TCP works, but, you don't have to go that far into it. Seeing it will be enough to get the idea.
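If you want to try the Linux route, the capability in question is presumably tc with the netem qdisc (that's an assumption, the post doesn't name it). Here's a rough sketch that just prints the commands for each base + jitter split of a fixed total; it doesn't run anything. The interface name eth0 and the helper names are placeholders. One catch: netem's second delay value is a +/- variation around the first, so to get "base plus up to J of jitter" the sketch centers the delay at base + J/2 with a variation of J/2.

```python
from typing import List, Tuple

def netem_command(base_ms: float, jitter_ms: float, iface: str = "eth0") -> str:
    """Build a tc/netem command whose delay ranges from base_ms to base_ms + jitter_ms."""
    center = base_ms + jitter_ms / 2    # netem delay is the midpoint of the range
    spread = jitter_ms / 2              # netem variation is +/- around that midpoint
    cmd = f"tc qdisc add dev {iface} root netem delay {center:g}ms"
    if spread > 0:
        cmd += f" {spread:g}ms"
    return cmd

def splits(total_ms: int, step_ms: int) -> List[Tuple[int, int]]:
    """All (base, jitter) pairs in step_ms increments with base + jitter == total_ms."""
    return [(base, total_ms - base) for base in range(0, total_ms + 1, step_ms)]

# The three systems from the paragraph above: 0/300, 150/150 and 300/0.
for base, jitter in splits(300, 150):
    print(f"base {base}ms + jitter {jitter}ms: {netem_command(base, jitter)}")
```

The commands need root, they only delay packets leaving that interface, and `tc qdisc del dev eth0 root` removes the qdisc again between runs.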
The weialgo graphs attempt to take this fact and put it into a visual form that any lay person should be able to see and understand. That's why weialgo and the format of the weialgo graphs.
Hopefully, my next article will be on the weialgo graphing scripts and reporting process.
Rob