• Alibek Jakupov

API SNCF + Power BI Real Time: better together

Updated: May 17


As we rush, as we rush in the Train,

The trees and the houses go wheeling back,

But the starry heavens above the plain

Come flying on our track.

(from In The Train by James Thomson)

Power BI offers an excellent opportunity to push your data in real-time to your dashboard and see the result in the form of constantly changing visuals showing your KPIs moving up and down. On the other hand you have got SNCF API, allowing to get all the train data (departure, destination, number, headsign etc) from any train station in France in a near-real time mode (20 request per minute). So why not combining them together and making a beautiful service with only few lines of code and some basic manipulation in Power BI. The idea is to visualize a trains head sign and its destination. Sounds simple, right? Up we go!


This post is partly inspired by an excellent tutorial provided by BharatiDWConsultancy YouTube channel (see Power BI Streaming Real Time Dashboards in 10 minutes - DIY -40-of-50 https://www.youtube.com/watch?v=dXpFciYOVsc). Thank you, guys, you are doing awesome things!


So, creating a real-time dashboard is relatively simple. If you follow the steps in the tutorial, you will understand how it works. You first define your json input (‘Values from stream’ field) and obtain your power bi API url. You then define this json input in your python code and send it to power bi. Finally, in dashboard menu, you need to create some visuals for your KPIs. However, the list of visuals VERY limited. There are only several flow charts, histograms and cards (which in turn do not support string data), but we’ll discuss this later.

First, we go to our power bi dashboard, click on add new tile and create a custom streaming data.


You then create the fields you want to be updated in a real-time and save the sample json somewhere on your local machine (this will be needed to compare your python output to this sample).

What actually happens is that you are creating the data structure that Power BI will store into a temporary cache, which quickly expires. This means that there is no underlying database, so you cannot build report visuals using the data that flows in from the stream. As such, you cannot make use of report functionality such as filtering, custom visuals, and other report functions (https://docs.microsoft.com/en-us/power-bi/service-real-time-streaming).

Again, the tutorial mentioned above gives an excellent explanation of how to create a simple real-time dashboard using random data generated by your python script. In our case, we are interested in creating a service based on real data, so we are going to create our SNCF API account and call the API from our python code. Important note: the list of fields we have created when we followed the tutorial is "headsign", "counter", "departure" (int, int, string).

According to the official information (https://www.digital.sncf.com/faq/api):

L’utilisation de l’API nécessite une clé d’authentification (« token »). Pour obtenir votre clé d'authentification et utiliser l'API SNCF, vous devez préalablement vous inscrire au service grâce au formulaire d'inscription accessible à l’adresse : https://www.digital.sncf.com/startup/api/token-developpeur

Which means that we need to create an account and the sncf team will send us the key by mail. Well, let’s do this.

After you have created an account ( https://www.digital.sncf.com/startup/api/token-developpeur) you should have received the following message:

Bonjour User,

Merci de vous être inscrit à l’API SNCF. Voici votre clé d’authentification :

<your-token>

Comment vous authentifier à l'API ? Accédez à l'API : https://api.sncf.com/v1 Username : copiez puis collez votre clé Password : laissez le champ vide

ou copiez votre clé directement dans l'URL.

So we open this url in our browser and paste our token in the username field (we put nothing in the password field).


Let’s give it a try…


Great, we now have access to our real-time data!

The only thing we must do is to parse this json output.

Here is the complete code:


import requests
import time
import json
import random
import pandas as pd

# Authentication token
api_auth = 'your-api-goes_here'
# counter
counter = 0
# Power BI Real Time API
# Replace by your own
REST_API_URL = 'https://api.powerbi.com/beta/6494460e-8600-4edc-850f-528e8faad290/datasets/288527d3-eb81-4f17-9df1-dd7270163b86/rows?key=jOtLG%2FGwRo6xcu%2B%2FkyFVn4rU9Z0fbTlNq0CQp2%2FNGB5UtLiG54TJQRauUY26chuRRpOF9d1cJTXwDlmL2MA3Rw%3D%3D'
# next departures from Montparnasse
request_url = 'https://api.sncf.com/v1/coverage/sncf/stop_areas/stop_area:OCE:SA:87391003/departures?' \
 'datetime=20180912T103450'
# endless loop to call the api every 5 seconds

def data_generation(counter=None):
 # call the SNCF Open API
 # # auth = ('username', 'password')
    request_result = requests.get(request_url, auth=(api_auth, ''))
 # json output : context
    context = request_result.json()['context']
 # context sub-node: current date time
    cur_date = context['current_datetime']
 # json_output : departures
    departures = request_result.json()['departures']
    departure = departures[0]['display_informations']['direction']
 # headsign of the fisrt train
    headsign = int(departures[0]['display_informations']['headsign'])
 
 return [headsign, counter, departure]


while True:
    data_raw = []
 # simple counter increment
    counter += 1
 for i in range(1):
        row = data_generation(counter)
        data_raw.append(row)
 
 set the header record
    HEADER = ["headsign", "counter", "departure"]
 # generate a temp data frame to convert it to json
    data_df = pd.DataFrame(data_raw, columns=HEADER)
 # prepare date for post request (to be sent to Power BI)
    data_json = bytes(data_df.to_json(orient='records'), encoding='utf-8')
 # Post the data on the Power BI API
    req = requests.post(REST_API_URL, data_json)
    print("Data posted in Power BI API")
    print(data_json)
 # wait 5 seconds
    time.sleep(4)

We now can visualize this data in our Power BI dashboard. However, we still can’t find a way to visualize the destination. Fortunately, Benjamin Crochard, our Power BI expert, has found a workaround: simply use a clustered bar chart with the destination as a legend and counter as a bar value. Well, even if it is not a ‘true’ solution, it definitely allows to bypass this limitation.

Now it is up to you to adapt these two solution to your needs and make our everyday lives little better.

©2018 by macnabbs. Proudly created with Wix.com