API SNCF + Power BI Real Time: better together
Updated: May 17, 2020
As we rush, as we rush in the Train,
The trees and the houses go wheeling back,
But the starry heavens above the plain
Come flying on our track.
(from In The Train by James Thomson)
Power BI offers an excellent opportunity to push your data in real-time to your dashboard and see the result in the form of constantly changing visuals showing your KPIs moving up and down. On the other hand you have got SNCF API, allowing to get all the train data (departure, destination, number, headsign etc) from any train station in France in a near-real time mode (20 request per minute). So why not combining them together and making a beautiful service with only few lines of code and some basic manipulation in Power BI. The idea is to visualize a trains head sign and its destination. Sounds simple, right? Up we go!
This post is partly inspired by an excellent tutorial provided by BharatiDWConsultancy YouTube channel (see Power BI Streaming Real Time Dashboards in 10 minutes - DIY -40-of-50 https://www.youtube.com/watch?v=dXpFciYOVsc). Thank you, guys, you are doing awesome things!
So, creating a real-time dashboard is relatively simple. If you follow the steps in the tutorial, you will understand how it works. You first define your json input (‘Values from stream’ field) and obtain your power bi API url. You then define this json input in your python code and send it to power bi. Finally, in dashboard menu, you need to create some visuals for your KPIs. However, the list of visuals VERY limited. There are only several flow charts, histograms and cards (which in turn do not support string data), but we’ll discuss this later.
First, we go to our power bi dashboard, click on add new tile and create a custom streaming data.
You then create the fields you want to be updated in a real-time and save the sample json somewhere on your local machine (this will be needed to compare your python output to this sample).
What actually happens is that you are creating the data structure that Power BI will store into a temporary cache, which quickly expires. This means that there is no underlying database, so you cannot build report visuals using the data that flows in from the stream. As such, you cannot make use of report functionality such as filtering, custom visuals, and other report functions (https://docs.microsoft.com/en-us/power-bi/service-real-time-streaming).
Again, the tutorial mentioned above gives an excellent explanation of how to create a simple real-time dashboard using random data generated by your python script. In our case, we are interested in creating a service based on real data, so we are going to create our SNCF API account and call the API from our python code. Important note: the list of fields we have created when we followed the tutorial is "headsign", "counter", "departure" (int, int, string).
According to the official information (https://www.digital.sncf.com/faq/api):
L’utilisation de l’API nécessite une clé d’authentification (« token »). Pour obtenir votre clé d'authentification et utiliser l'API SNCF, vous devez préalablement vous inscrire au service grâce au formulaire d'inscription accessible à l’adresse : https://www.digital.sncf.com/startup/api/token-developpeur
Which means that we need to create an account and the sncf team will send us the key by mail. Well, let’s do this.
After you have created an account ( https://www.digital.sncf.com/startup/api/token-developpeur) you should have received the following message:
Merci de vous être inscrit à l’API SNCF. Voici votre clé d’authentification :
Comment vous authentifier à l'API ? Accédez à l'API : https://api.sncf.com/v1 Username : copiez puis collez votre clé Password : laissez le champ vide
ou copiez votre clé directement dans l'URL.
So we open this url in our browser and paste our token in the username field (we put nothing in the password field).
Let’s give it a try…
Great, we now have access to our real-time data!
The only thing we must do is to parse this json output.
Here is the complete code:
import requests import time import json import random import pandas as pd # Authentication token api_auth = 'your-api-goes_here' # counter counter = 0 # Power BI Real Time API # Replace by your own REST_API_URL = 'https://api.powerbi.com/beta/6494460e-8600-4edc-850f-528e8faad290/datasets/288527d3-eb81-4f17-9df1-dd7270163b86/rows?key=jOtLG%2FGwRo6xcu%2B%2FkyFVn4rU9Z0fbTlNq0CQp2%2FNGB5UtLiG54TJQRauUY26chuRRpOF9d1cJTXwDlmL2MA3Rw%3D%3D' # next departures from Montparnasse request_url = 'https://api.sncf.com/v1/coverage/sncf/stop_areas/stop_area:OCE:SA:87391003/departures?' \ 'datetime=20180912T103450' # endless loop to call the api every 5 seconds def data_generation(counter=None): # call the SNCF Open API # # auth = ('username', 'password') request_result = requests.get(request_url, auth=(api_auth, '')) # json output : context context = request_result.json()['context'] # context sub-node: current date time cur_date = context['current_datetime'] # json_output : departures departures = request_result.json()['departures'] departure = departures['display_informations']['direction'] # headsign of the fisrt train headsign = int(departures['display_informations']['headsign']) return [headsign, counter, departure] while True: data_raw =  # simple counter increment counter += 1 for i in range(1): row = data_generation(counter) data_raw.append(row) # set the header record HEADER = ["headsign", "counter", "departure"] # generate a temp data frame to convert it to json data_df = pd.DataFrame(data_raw, columns=HEADER) # prepare date for post request (to be sent to Power BI) data_json = bytes(data_df.to_json(orient='records'), encoding='utf-8') # Post the data on the Power BI API req = requests.post(REST_API_URL, data_json) print("Data posted in Power BI API") print(data_json) # wait 5 seconds time.sleep(4)
We now can visualize this data in our Power BI dashboard. However, we still can’t find a way to visualize the destination. Fortunately, Benjamin Crochard, our Power BI expert, has found a workaround: simply use a clustered bar chart with the destination as a legend and counter as a bar value. Well, even if it is not a ‘true’ solution, it definitely allows to bypass this limitation.
Now it is up to you to adapt these two solution to your needs and make our everyday lives little better.