This content is also available at learn.palantir.com and is presented here for accessibility purposes.
You should now see Branch: test-branch at the top of your workbook. Click the time_by_carrier visualization node to see its code appear in the bottom left of your screen. The visualization currently plots the number of minutes spent in flight for the 10 airline carriers with the largest total minutes in flight. We will now update the visualization to plot the minutes spent in flight by route, for the 10 routes with the most total minutes in flight.
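# F and plt are used below; import them if they are not already available in your workbook
from pyspark.sql import functions as F
import matplotlib.pyplot as plt

def time_by_carrier(us_freight_flights):
    # build a route label of the form "origin-dest", then convert to pandas
    tf_pdf = us_freight_flights.withColumn('route', F.concat(F.col("origin"), F.lit("-"), F.col("dest"))).toPandas()
    # total minutes in flight per route
    tf_grouped = tf_pdf.groupby('route')['actual_elapsed_time'].sum().to_frame().reset_index()
    # filter to top 10 routes
    tf_grouped = tf_grouped.sort_values(by=['actual_elapsed_time'], ascending=False).head(10)
    # plot bar chart
    tf_grouped.plot.bar(x='route', y='actual_elapsed_time')
    # rotate the labels first so tight_layout can make room for them
    plt.xticks(rotation=70)
    plt.tight_layout()
    plt.show()
    # return the original dataframe
    return us_freight_flights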
Replace the time_by_carrier code with the block above.
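
To see what the route aggregation in that code does without running the full workbook, the grouping and top-N steps can be tried on a small pandas DataFrame. This is only an illustrative sketch: the sample rows and the helper name top_routes are hypothetical, and only the column names (origin, dest, actual_elapsed_time) come from the code above.

```python
import pandas as pd

# Hypothetical sample rows; only the column names match the tutorial code.
flights = pd.DataFrame({
    "origin": ["JFK", "JFK", "LAX", "ORD", "LAX"],
    "dest": ["LAX", "LAX", "JFK", "JFK", "JFK"],
    "actual_elapsed_time": [360, 350, 330, 140, 320],
})


def top_routes(pdf, n=10):
    """Return the n routes with the largest total actual_elapsed_time."""
    # Build the same "origin-dest" route label, here with pandas string concatenation.
    pdf = pdf.assign(route=pdf["origin"] + "-" + pdf["dest"])
    # Sum minutes per route, then keep the n largest totals.
    totals = pdf.groupby("route")["actual_elapsed_time"].sum().reset_index()
    return totals.sort_values(by="actual_elapsed_time", ascending=False).head(n)


top = top_routes(flights, n=2)
print(top)
```

With these sample rows, JFK-LAX and LAX-JFK are treated as separate routes because the label is built from origin and dest in order.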