New version of SamReports will support Trixbox and Fonality and Big Datasets

Update: 2011-12-07 : SAM Reports 2 has been launched, as you may see on this blog post.

Accessing sliced log files

I’ve been working on a new version of SamReports for quite some time. My main goal was to provide seamless “integration” with Trixbox/Fonality SKUs. (It’s not really integration, but the process of copying thousands of files from the PBX, gluing them together, processing them and so on has to be transparent to the user, with as little hassle as possible, preferably none.)

I came into possession of a large dataset, and that has helped a lot. You cannot possibly simulate real, raw data to work with. You would just end up with gibberish. That brings up the issue of processing.

Big data

The current version of SamReports tries to load all the data into memory to generate reports. That just doesn’t work for millions of log lines.

The new version is capable of processing any amount of data, because it works in chunks, and generates reports sequentially. I tested on a dataset with more than 10 million lines in CDR, and it finished in 37 minutes.
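To illustrate the idea of working in chunks, here is a minimal Python sketch. It is not the SamReports code. It assumes the default Asterisk cdr_csv column order, where billsec is field 13 and disposition is field 14 (counting from 0), and the names `iter_chunks` and `summarize` are made up for this example.

```python
import csv
from collections import Counter
from itertools import islice

# Assumed positions in the default Asterisk cdr_csv layout (0-based).
BILLSEC, DISPOSITION = 13, 14


def iter_chunks(rows, size):
    """Yield lists of at most `size` rows, without reading everything first."""
    it = iter(rows)
    while chunk := list(islice(it, size)):
        yield chunk


def summarize(lines, chunk_size=50_000):
    """Count calls and billed seconds per disposition, one chunk at a time."""
    calls = Counter()
    seconds = Counter()
    for chunk in iter_chunks(csv.reader(lines), chunk_size):
        for row in chunk:
            if len(row) <= DISPOSITION:
                continue  # skip truncated or malformed lines
            billsec = row[BILLSEC]
            calls[row[DISPOSITION]] += 1
            seconds[row[DISPOSITION]] += int(billsec) if billsec.isdigit() else 0
    return calls, seconds
```

You would call it with an open file, for example `summarize(open("master.csv", newline=""))`. Only one chunk of rows is held in memory at any time, and the running totals stay small no matter how long the log is.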

Generating TrixBox and Fonality Reports

SamReports Parser for TrixBox and Fonality

The image above shows the summary generated after processing the log files. The graphs are drawn while the parsing is under way, so that you’re not bored while processing a large dataset for the first time :) The summary then displays the same charts, to give you an overview.

Quirks and twists of master.csv (chopped or not)

I did get my hands dirty with the local and transferred calls in the CDR logs. Sometimes a single call spans 5 or 6 log lines. Those lines are not necessarily close together; they can be as much as 1000 lines apart.

master.csv calls on a local channel

local channels in master.csv

Here you see a call that spans just 2 lines, one right after the other.

To be continued…

This is just a preview of what’s been cooking. There are many other things to mention, but I will do it in another post…

Need a VM of an operational TrixBox/Fonality

Why do I need it?

As I explained in the previous post, I need a virtual machine of an operational TrixBox PBX (Fonality edition).

  • I need it to test connecting to the MySQL database in order to read the CDR and queue data.
  • I also need it to further improve on my merging of the log files created by the system, and transferring them.

Privacy issues

  • I’m aiming at PBXs that are no longer in use, but have not been refurbished yet.
  • I will obfuscate the data. I made a little program, Randy, that generates random names and phone numbers, just for that purpose. I used it to replace all the identifiable data from one of my clients. I use that data to demonstrate the features of Sam Reports, the product behind this blog.
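To give an idea of what such obfuscation can look like, here is a small Python sketch. It is not Randy, just an illustration of the principle, and the class name `Obfuscator` and its method are invented for this example. It replaces every digit of a phone number with a random digit and remembers the mapping, so the same real number always turns into the same fake number and the call patterns in the reports stay intact.

```python
import random
import string


class Obfuscator:
    """Map real phone numbers to random but consistent fake numbers."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._numbers = {}  # real number -> fake number

    def fake_number(self, real):
        if real not in self._numbers:
            # Replace digits only, so "+", spaces and dashes stay in place.
            self._numbers[real] = "".join(
                self._rng.choice(string.digits) if ch.isdigit() else ch
                for ch in real
            )
        return self._numbers[real]
```

Note that this simple version does not guarantee that two different real numbers get different fake numbers. With short extensions and a lot of data, collisions are possible and would have to be handled.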

What’s in it for you?

In Croatian: “Ima’l mene tute?” (roughly, “Is there anything in it for me?”)

About beta and TrixBox-Fonality

The first five days of beta are behind me. Some people subscribed to the Google Group and others corresponded with me only via email. The feedback so far has been very useful, and I’m satisfied with how it is all going.

Update 2011-12-07 : SAM Reports 2 supports Trixbox/Fonality without any additional modules. Take a look at our new help files.

TrixBox (Fonality) issues

I would really like to make reading the reports from TrixBox (Fonality) a bit easier. Sam Reports generates Asterisk reports from log files that are present on every Asterisk box by default (master.csv and queue_log). TrixBox slices the log files into a bunch of small files and puts them in date-named directories. Let me describe it in detail:

Master.csv

Originally it is just one CSV file, located in /var/log/asterisk/cdr-csv/master.csv (I don’t care about cdr-custom at the moment). On a Fonality PBX (I’m not sure that the situation is the same on all TrixBox SKUs), the master.csv data is kept under the same directory, but in additional subdirectories that are created there on a weekly basis, for example /var/log/asterisk/cdr-csv/09-10-01/ (the name 09-10-01 is a date in YY-MM-DD format). A new directory is created on Monday of each week, as stated in the Fonality help files. I have copied all the files from a Fonality PBX to my local PC, and below you can see a screenshot:

Fonality (TrixBox) master.csv

In each of these directories there are thousands of master.csv parts named Master.csv.xxxxxxxxxx, where xxxxxxxxxx is a UNIX timestamp (the number of seconds elapsed since Jan 1, 1970). On a Fonality PBX, one such file is created every 5 minutes. The file named just “master.csv”, which you see in the figure above, is the latest log, which has not yet been rotated and stored in a directory.
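Both naming schemes are easy to decode. The Python sketch below shows one way to do it; the function names are made up for this example, and the timestamp 1254355200 used in the comment is just a sample value, not a file taken from a real PBX.

```python
import re
from datetime import date, datetime, timezone

# "<base name>.<10-digit UNIX timestamp>", e.g. "Master.csv.1254355200"
PART_NAME = re.compile(r"(?P<base>.+)\.(?P<stamp>\d{10})")


def part_time(filename):
    """Return the UTC time encoded in a part's name, or None if there is none."""
    match = PART_NAME.fullmatch(filename)
    if match is None:
        return None  # e.g. the current, un-rotated "master.csv"
    return datetime.fromtimestamp(int(match.group("stamp")), tz=timezone.utc)


def directory_date(dirname):
    """Turn a weekly directory name such as "09-10-01" (YY-MM-DD) into a date."""
    return datetime.strptime(dirname, "%y-%m-%d").date()
```

For example, `part_time("Master.csv.1254355200")` gives midnight UTC on October 1, 2009, and `directory_date("09-10-01")` gives the same day as a plain date.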

Fonality-TrixBox master.csv parts

Queue_log

Originally, the queue_log file is located in /var/log/asterisk/queue_log. On Fonality-TrixBox it is also dissected into many parts, in much the same way.

Fonality-TrixBox queue_log

The original “queue_log” file holds just the latest 5 minutes’ worth of information, and all the other data is in files created in the same way as the CDR files.

Fonality-TrixBox queue_log parts

Merging the files

One of my beta testers wanted to try Sam Reports, but got only 5 minutes’ worth of data. So I decided to make a little program that merges all the disjointed files into one master.csv and one queue_log. If you ever have such a need, you can download the file here: FonalityMerge.

  • It requires that you have copied all the log files and directories from your Fonality box, somewhere on your local disk.
  • Then you merge the files with FonalityMerge
  • And you get your master.csv and queue_log

Now you can process the log files with Sam Reports and see your reports.
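If you are curious what such a merge involves, here is a rough Python sketch of the idea. It is not the FonalityMerge source. It assumes that every part is named like the base file plus a UNIX timestamp (matched without regard to upper or lower case), that every part ends with a newline, and that the un-rotated file sits at the top of the copied directory. The function name `merge_parts` is invented for this example.

```python
import re
import shutil
from pathlib import Path


def merge_parts(root, base, output):
    """Concatenate all "<base>.<timestamp>" parts under root, oldest first.

    Write `output` somewhere outside `root`, so that it is not picked up
    as an input. Returns the number of files that were merged.
    """
    root = Path(root)
    pattern = re.compile(re.escape(base) + r"\.(\d+)", re.IGNORECASE)

    stamped = []
    for path in root.rglob("*"):
        match = pattern.fullmatch(path.name)
        if match and path.is_file():
            stamped.append((int(match.group(1)), path))
    stamped.sort()  # order by timestamp, whatever directory a part is in
    ordered = [path for _, path in stamped]

    # The un-rotated file holds the newest records, so it goes last.
    current = root / base
    if current.is_file():
        ordered.append(current)

    with open(output, "wb") as out:
        for path in ordered:
            with open(path, "rb") as part:
                shutil.copyfileobj(part, out)
    return len(ordered)
```

You would run it once per log type, for example `merge_parts("cdr-csv", "master.csv", "merged/master.csv")` and again with `"queue_log"` as the base name. The directory names here are placeholders for wherever you copied the files.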

Further improvement

If I get enough inquiries for TrixBox/Fonality support, I may opt to do one of the following:

  • Make the transfer of directories from Fonality-TrixBox available with a click of a button. That would take into account just the diff between directories and files (the directories and files already present on the local disk would not be transferred again).
  • Or go for the database option and connect directly to the MySQL database with the logs. To do that I would need a working TrixBox with a database that contains meaningful data. Meaningful means real data generated by using a system for at least 5-6 months.

It would be great if someone could provide me with a virtual machine of a TrixBox PBX that has been operational for some time. I know it’s a long shot. I would be veeeery grateful.
