Provencia
Provencia sara.viz example.
>>> from examples.provencia import ProvenciaEnvSchema
>>> import textwrap
>>> print(textwrap.fill(ProvenciaEnvSchema.Config.metadata["description"], width=80))
Créé en 1963 à Annecy, le groupe Provencia est aujourd hui un acteur majeur de
la grande distribution en Rhône Alpes. Cet environnement modélise la commande de
produits pour le rayon boucherie.
Data
The data are read through an online connection to the Provencia database.
Q-table visualization
>>> from examples.provencia import etl_pipeline, env_pipeline, get_store_ids, get_product_codes
>>> from sara.oar import enrich_rtg, discount_from_horizon, bin_with_quantiles
>>> from sara.viz import plot_insight
>>> from pathlib import Path
>>> filename = Path("docs/source/provencia/provencia.png")
>>> store_ids = get_store_ids().index.to_list()
>>> product_codes = get_product_codes(store_ids)
>>> sausage_codes = product_codes.index[
...     product_codes['description'].str.contains('CHIPOS MERGUEZ')].to_list()
>>> df = etl_pipeline(store_ids, sausage_codes)
>>> df = env_pipeline(df)
>>> df = enrich_rtg(df, discount_from_horizon(30))
>>> num_quantiles = {
...     ('act0', 'ordered'): 10,
...     ('obs0', 'stock_after_delivery'): 10}
>>> df = bin_with_quantiles(df, num_quantiles)
>>> _, _, _ = plot_insight(
...     df,
...     col_labels=list(num_quantiles),
...     filename=filename)
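The `enrich_rtg` step attaches a discounted return-to-go to each row before binning. The sketch below is not the sara implementation. It only illustrates the usual backward recursion G_t = r_t + gamma * G_{t+1}, and it assumes that a horizon H corresponds to a discount gamma = 1 - 1/H, which is a common convention but is not confirmed for `discount_from_horizon`. The names `gamma_from_horizon` and `return_to_go` are hypothetical.

```python
import numpy as np


def gamma_from_horizon(horizon: int) -> float:
    """Hypothetical helper: assumes an effective horizon H means gamma = 1 - 1/H."""
    return 1.0 - 1.0 / horizon


def return_to_go(rewards, gamma: float) -> np.ndarray:
    """Discounted sum of the current and all later rewards, computed backwards."""
    rewards = np.asarray(rewards, dtype=float)
    rtg = np.zeros_like(rewards)
    running = 0.0
    # Walk from the last step to the first so each step reuses the next one.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        rtg[t] = running
    return rtg


gamma = gamma_from_horizon(30)
# Illustrative daily profits, one value per day for a single product.
rtg = return_to_go([0.0, 0.0, 1834.0, 0.0, -1680.0], gamma)
print(rtg)
```

With a horizon of 30 days, a purchase paid today still weighs noticeably on the return-to-go of the preceding days, which is why ordering decisions are judged on more than the same-day profit.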
Interpretation
High returns are found for high order quantities combined with high stock levels, which indicates that it is important to keep a high level of stock. In addition, the data contain no very high stock levels that would lead to products being thrown away. Together with the fact that an order of 0 is selected more often than any other quantity, this suggests that Provencia should move toward ordering larger quantities.
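To make this reading of the Q-table concrete, here is a minimal sketch of the kind of aggregation such a plot summarises: bin the action and the observation into quantiles, then compute the mean return and the number of visits per cell. It uses plain pandas (`pd.qcut`, `groupby`) on synthetic data. It is not the code of `bin_with_quantiles` or `plot_insight`, and the flat column names `ordered`, `stock` and `rtg` are made up for the illustration.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in for the environment dataframe (not Provencia data).
df = pd.DataFrame({
    "ordered": rng.integers(0, 20, size=n),
    "stock": rng.integers(0, 50, size=n),
})
# Synthetic return that grows with both the order and the stock level.
df["rtg"] = 10.0 * df["ordered"] + 2.0 * df["stock"] + rng.normal(0.0, 5.0, size=n)

# Quantile binning; duplicates="drop" merges bins whose edges coincide,
# which happens when one value (for example 0) is very frequent.
df["ordered_bin"] = pd.qcut(df["ordered"], q=4, labels=False, duplicates="drop")
df["stock_bin"] = pd.qcut(df["stock"], q=4, labels=False, duplicates="drop")

grouped = df.groupby(["ordered_bin", "stock_bin"])["rtg"]
q_table = grouped.mean().unstack("stock_bin")  # mean return per cell
visits = grouped.size().unstack("stock_bin")   # how often each cell occurs

print(q_table.round(1))
print(visits)
```

The `visits` table is the counterpart of the remark on how often each order quantity is selected: a cell with a high mean return but few visits is an action that looks good yet is rarely taken.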
API reference
- class examples.provencia.ProvenciaETLSchema(*args, **kwargs)
Schema describing the Provencia ETL dataframe.
- class examples.provencia.ProvenciaEnvSchema(*args, **kwargs)
Pandera OAR schema for the Provencia supply chain environment.
- examples.provencia.env_pipeline(df_etl: DataFrame, lag_profit: int = 0) → DataFrame[OARSchema]
Transform Provencia ETL dataframe into Env dataframe with environment definition.
- Parameters:
df_etl – dataframe from etl_pipeline()
lag_profit – number of days between ‘total_paid’ (at ‘date’) and ‘gross’ (at ‘date’ + lag_profit) in the profit calculation for the reward.
- Returns:
OAR dataframe formatted with environment
- Return type:
DataFrame[OARSchema]
Examples
>>> from examples.provencia import etl_pipeline
>>> df = etl_pipeline(
...     ['JE', 'KV', 'EV'],
...     [2870622000000, 2870557000000, 2870549000000])
>>> env_pipeline(df)
signal                                obs0                                                                act0     rew1
key                                sold_yd  gross_yd  stock_after_delivery  delivered  purchase_price  ordered   profit
store_id product_code  date
EV       2870549000000 2023-12-06        0         0                     5          0             647        0      0.0
                       2023-12-07        0         0                     5          0             647        0      0.0
                       2023-12-08        0         0                     5          0             647        0   1834.0
                       2023-12-09        4      1834                     1          0             647        0      0.0
                       2023-12-10        0         0                     1          0             647        0      0.0
...                                    ...       ...                   ...        ...             ...      ...      ...
         2870622000000 2024-07-28        0         0                     0          0             560        0      0.0
                       2024-07-29        0         0                     0          0             560        0      0.0
                       2024-07-30        0         0                     0          0             560        0      0.0
                       2024-07-31        0         0                     0          0             560        0      0.0
                       2024-08-01        0         0                     0          0             560        1  -1680.0
[515 rows x 7 columns]
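The role of `lag_profit` can be illustrated with a small pandas sketch. It assumes, from the parameter description only, that the profit is the ‘gross’ observed at ‘date’ + lag_profit minus the ‘total_paid’ at ‘date’, with one row per consecutive day. The helper `profit_with_lag` is hypothetical and is not the code of `env_pipeline`.

```python
import pandas as pd


def profit_with_lag(df: pd.DataFrame, lag_profit: int = 0) -> pd.Series:
    """Hypothetical helper: gross at date + lag_profit minus total_paid at date.

    Assumes one row per consecutive day for a single store and product.
    """
    # shift(-k) brings the value found k rows later onto the current row.
    future_gross = df["gross"].shift(-lag_profit)
    return future_gross - df["total_paid"]


dates = pd.date_range("2023-12-05", periods=5, freq="D")
etl = pd.DataFrame(
    {"total_paid": [0.0, 0.0, 0.0, 0.0, 0.0], "gross": [0, 0, 0, 1834, 0]},
    index=dates,
)

same_day = profit_with_lag(etl, lag_profit=0)
one_day_lag = profit_with_lag(etl, lag_profit=1)
print(pd.DataFrame({"lag0": same_day, "lag1": one_day_lag}))
```

With a positive lag, the last rows have no future ‘gross’ to pair with and come out as NaN in this sketch. On a frame holding several stores and products, the shift would have to be applied within each (store_id, product_code) group so that values do not leak from one series to the next.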
- examples.provencia.etl_pipeline(store_ids: list[str], product_codes: list[int]) → DataFrame[ProvenciaETLSchema]
Provencia SQL connector.
- Parameters:
store_ids – the list of store ids
product_codes – the list of product codes
- Returns:
the Provencia ETL dataframe; see ProvenciaETLSchema for a description of its structure.
- Return type:
DataFrame[ProvenciaETLSchema]
Examples
>>> etl_pipeline(
...     ['JE', 'KV', 'EV'],
...     [2870622000000, 2870557000000, 2870549000000])
                                   delivered  purchase_price  ordered  total_paid  sold  gross  stock_evening
store_id product_code  date
EV       2870549000000 2023-12-05          4             647        0         0.0     0      0              5
                       2023-12-06          0             647        0         0.0     0      0              5
                       2023-12-07          0             647        0         0.0     0      0              5
                       2023-12-08          0             647        0         0.0     4   1834              1
                       2023-12-09          0             647        0         0.0     0      0              1
...                                      ...             ...      ...         ...   ...    ...            ...
         2870622000000 2024-07-28          0             560        0         0.0     0      0              0
                       2024-07-29          0             560        0         0.0     0      0              0
                       2024-07-30          0             560        0         0.0     0      0              0
                       2024-07-31          0             560        0         0.0     0      0              0
                       2024-08-01          0             560        1      1680.0     0      0              0
[517 rows x 7 columns]
Note
product_code=2870626000000 (foie de porc, i.e. pork liver) is problematic in the order table.
- examples.provencia.get_product_codes(store_ids: list[str]) → DataFrame
Get all product codes and their descriptions for the specified stores.
- Parameters:
store_ids – a list of store ids to get product codes from
- Returns:
the product_codes as index, the description as column
- Return type:
pd.DataFrame
Example
>>> get_product_codes(['JE', 'KV', 'EV'])
                       description
product_code
2870622000000      *PO-SAUTE X 3KG
2870557000000            PORC FOIE
2870549000000   KG SAUTE PORC FRAN
2477358000000  *EL-ROTI VEAU FARCI
2276357000000  *SA-SAUCI MENAGE X3
...                            ...
2477424000000  *VO-CUISSE DE POULE
2443970000000     SAUCISSON TRUFFE
2280328000000    EL-CREPINETTE X 3
2224892000000   GIGOT *** ENTIER S
2610547000000     BLOC DE DINDE RA
[519 rows x 1 columns]
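Because the product codes are the index and the descriptions a column, selecting the codes of a product family comes down to a string match on `description`, as the Q-table example does for 'CHIPOS MERGUEZ'. The sketch below works on a hand-built frame that stands in for the query result (three rows copied from the output above). The helper `codes_matching` is hypothetical. It passes `regex=False` so that a search text containing characters such as `*` is matched literally instead of being parsed as a regular expression.

```python
import pandas as pd

# Hand-built stand-in for the result of get_product_codes(...).
product_codes = pd.DataFrame(
    {"description": ["*PO-SAUTE X 3KG", "PORC FOIE", "KG SAUTE PORC FRAN"]},
    index=pd.Index(
        [2870622000000, 2870557000000, 2870549000000], name="product_code"
    ),
)


def codes_matching(codes: pd.DataFrame, text: str) -> list:
    """Hypothetical helper: codes whose description contains `text` literally."""
    mask = codes["description"].str.contains(text, regex=False)
    return codes.index[mask].to_list()


saute_codes = codes_matching(product_codes, "SAUTE")
star_codes = codes_matching(product_codes, "*PO")
print(saute_codes)
print(star_codes)
```

The resulting list can be passed as the `product_codes` argument of `etl_pipeline`.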
- examples.provencia.get_store_ids() → DataFrame
Get all the store ids and names as a DataFrame.
Specifically, the store ids found in the public.order table.
- Returns:
the store ids as index, the store name as column
- Return type:
pd.DataFrame
Examples
>>> get_store_ids()
                           name
store_id
JE                       Seynod
KV         Saint-Jeoire-Prieuré
EV                     Faverges
BS              Annecy-le-Vieux
MO                 Villeurbanne
...                         ...
LW             Thonon-les-Bains
KM        Saint-Jean-de-Moirans
FW                          Gex
GF                Grésy-sur-Aix
EM                     Douvaine