Thursday, August 30, 2018

Troubleshooting Dial-up T1 Lines

I had an interesting trouble ticket land in my lap a few years ago.  My employer at the time was one of the few service providers still using various and sundry Cisco AS5300 routers to provide dial-up (!) Internet service to customers.  In one location where we had one of these AS5300 routers, the CO tech was notified that his telephone switch was seeing "Remote Made Busy" alarms from my AS5300, and after some initial troubleshooting, he escalated the ticket to me to investigate from the router side.

Unfortunately, when I logged in to the router, I found nothing wrong:
as2.blah#sho run | begin controller
controller T1 0
framing esf
clock source line primary
linecode b8zs
cablelength short 133
ds0-group 0 timeslots 1-24 type e&m-fgb dtmf dnis
description HC 09201 tg#ISP2 trk 1-24, DTC 00-07, #xxx-1005
!
as2.blah#sho controller t1 0
T1 0 is up.
  Applique type is Channelized T1
  Cablelength is short 133
  Description: HC 09201 tg#ISP2 trk 1-24, DTC 00-07, #xxx-1005
  No alarms detected.
  alarm-trigger is not set
  Version info of slot 0:  HW: 1, PLD Rev: 11
  Framer Version: 0x8
<...snip...>
  Total Data (last 24 hours)
     1 Line Code Violations, 1 Path Code Violations,
     0 Slip Secs, 0 Fr Loss Secs, 1 Line Err Secs, 1 Degraded Mins,
     1 Errored Secs, 0 Bursty Err Secs, 0 Severely Err Secs, 0 Unavail Secs
as2.blah#sho caller ip
  Line           User       IP Address      Local Number    Remote Number   <->
as2.blah#

One thing worth ruling out was a manual busy-out.  You can manually busy-out a trunk, as shown here on Controller T1 2:
as2.blah#sho run | begin ontroller
<...snip...>
controller T1 2
framing esf
clock source line secondary 2
linecode b8zs
cablelength short 133
ds0-group 0 timeslots 1-24 type e&m-fgb dtmf dnis
ds0 busyout 1-24 soft
description 45.ISP.001119..8901 tg#ISP2 trk 49-72, DTC 04-01, #xxx-4108/xxx-1199
!
<...snip...>

See the line that says "ds0 busyout 1-24 soft"?  That tells the router to busy-out the individual voice channels inside the T1 (that is, to disable each DS-0, but only once it goes inactive).  However, that line didn't exist on Controller T1 0, so no one had intentionally busied-out the trunk.
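If you have more than a handful of controllers to check, eyeballing the config gets tedious.  Here is a minimal Python sketch of the same check, assuming you have already saved the "sho run" output as text.  The function name and the sample config are mine, made up for illustration; the only thing taken from the router is the "ds0 busyout" line itself.

```python
import re

def find_busied_out_controllers(config_text):
    """Return {controller name: [ds0 busyout lines]} for a saved running-config."""
    busied = {}
    current = None  # controller block we are currently inside, if any
    for raw in config_text.splitlines():
        line = raw.strip()
        match = re.match(r"controller\s+(\S+\s+\S+)$", line, re.IGNORECASE)
        if match:
            current = match.group(1)  # e.g. "T1 2"
            continue
        if line == "!":
            current = None  # "!" ends the controller block
            continue
        if current and line.lower().startswith("ds0 busyout"):
            busied.setdefault(current, []).append(line)
    return busied

# Hypothetical, trimmed-down config modeled on the output above.
SAMPLE_CONFIG = """\
controller T1 0
framing esf
ds0-group 0 timeslots 1-24 type e&m-fgb dtmf dnis
!
controller T1 2
framing esf
ds0-group 0 timeslots 1-24 type e&m-fgb dtmf dnis
ds0 busyout 1-24 soft
!
"""

print(find_busied_out_controllers(SAMPLE_CONFIG))
```

A controller that is missing from the result, like T1 0 here, has no manual busy-out configured.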

Once I had verified that there was nothing obviously wrong with the T1, I bounced the T1 line by running a shut/no shut on Controller T1 0.  No change.  Then, I rebooted the router.  Again, no change.  I called the CO tech, who confirmed that he was still seeing the "Remote Made Busy" alarm on the T1, meaning that from his equipment's perspective, my router had busied-out the individual lines on the T1.

Eventually, I called a co-worker, a Cisco AS5x00 guru back in the day, who showed me another troubleshooting tip:
as2.blah# sho controllers t1 0 call-counters
T1 0:
  DS0's Active: 0
  DS0's Active High Water Mark: 2
  TimeSlot   Type   TotalCalls   TotalDuration
      1       cas           6       00:36:48
      2       cas           7       01:19:29
      3       cas           7       00:24:16
      4       cas           7       00:30:35
      5       cas           7       00:15:49
      6       cas           6       02:33:36
      7       cas           7       03:06:59
      8       cas           7       00:23:25
      9       cas           7       03:01:43
     10       cas           5       04:03:10
     11       cas           6       00:38:36
     12       cas           7       01:08:50
     13       cas           5       05:33:33
     14       cas           6       01:36:16
     15       cas           5       00:16:07
     16       cas           6       01:06:34
     17       cas           5       01:06:48
     18       cas           5       00:09:15
     19       cas           6       00:05:20
     20       cas           6       02:12:24
     21       cas           6       01:25:18
     22       cas           5       00:27:50
     23       cas           5       00:42:23
     24       cas           6       01:47:45

Total DS0's Active High Water Mark: 3
Total Calls since System Bootup: 178
as2.blah#

Ideally, the "TotalCalls" column shows an even distribution of calls; that is, each individual timeslot in the T1 trunk has received approximately the same number of calls.  In this case, the distribution is in fact pretty even, with between 5 and 7 calls on each DS-0 (controller T1 1 looks even better, with almost exactly six calls per DS-0).  Also, the last column, "TotalDuration," shouldn't show any unusually low values, where "unusual" is determined entirely by context.  Here, the router had been rebooted recently, so fairly low call durations were to be expected.  However, if most of the timeslots had total call durations of 20-30 hours and one or two had total durations of, say, 30 minutes, that would be a pretty good indication of a problem on those DS-0s, especially if the router had not been rebooted in quite a while (the longer it has been running, the more even the call duration distribution should be).
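The same sanity check can be roughed out in a few lines of Python.  This is only a sketch under some assumptions of mine: the call-counters output has been saved as text, durations always appear as HH:MM:SS as they do above, and "unusually low" means less than 10% of the median for the trunk.  That threshold is arbitrary (remember, "unusual" depends on context), and the sample numbers are invented to match the 20-30 hour example, not taken from a real router.

```python
import re
from statistics import median

# One data row: timeslot, type, total calls, total duration (HH:MM:SS).
ROW = re.compile(r"^\s*(\d+)\s+(\w+)\s+(\d+)\s+(\d+):(\d{2}):(\d{2})\s*$")

def parse_call_counters(text):
    """Return a list of (timeslot, total_calls, total_seconds) tuples."""
    rows = []
    for line in text.splitlines():
        match = ROW.match(line)
        if match:
            slot, _kind, calls, hours, minutes, seconds = match.groups()
            total = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
            rows.append((int(slot), int(calls), total))
    return rows

def suspicious_timeslots(rows, ratio=0.1):
    """Flag timeslots whose calls or duration fall below ratio * median."""
    if not rows:
        return []
    median_calls = median(calls for _, calls, _ in rows)
    median_secs = median(secs for _, _, secs in rows)
    return [slot for slot, calls, secs in rows
            if calls < ratio * median_calls or secs < ratio * median_secs]

# Invented sample: three healthy timeslots and one with only 30 minutes.
SAMPLE_COUNTERS = """\
  TimeSlot   Type   TotalCalls   TotalDuration
      1       cas          40       22:15:00
      2       cas          41       27:40:10
      3       cas          39       00:30:00
      4       cas          42       25:05:30
"""

print(suspicious_timeslots(parse_call_counters(SAMPLE_COUNTERS)))
```

On a recently rebooted router, like the one above, the totals are too small for a fixed ratio to mean much, so treat anything this flags as a hint to look closer rather than a diagnosis.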

Eventually, the engineer I called agreed with my assessment: there did not appear to be anything wrong with the router or the T1 lines.  Our best guess was that, at some point in the last ten years or so since this router had been installed, our documentation in the controller description had diverged from what was actually plugged in to the router, meaning that controller T1 0 was not the one we really should have been troubleshooting.  Unfortunately, by the time I got that far with the troubleshooting process, the problem had mysteriously corrected itself, and as a result, I didn't get a chance to verify the controller descriptions.  That's a bit of a mixed blessing.  To the engineer in me, it was disappointing not to have found a definitive cause of the problem, but at least everything was working properly once again.
