Search data
Search macroeconomic data
- pynsee.macrodata.get_dataset_list(update=False, silent=False)
Download the full list of INSEE datasets from the BDM macroeconomic database
- Parameters:
update (bool, optional) – Set to True to manually update the metadata stored locally on the computer. Defaults to False.
silent (bool, optional) – Set to True to disable messages printed in log info
- Returns:
DataFrame – a dataframe containing the list of datasets available
Examples
>>> from pynsee.macrodata import get_dataset_list
>>> insee_dataset = get_dataset_list()
- pynsee.macrodata.get_series_list(*datasets, update=False, silent=False)
Download the list of INSEE series keys for one or several datasets from the BDM macroeconomic database
- Parameters:
datasets (str) – datasets should be among the datasets list provided by get_dataset_list()
update (bool, optional) – Set to True to manually update the metadata stored locally on the computer. Defaults to False.
silent (bool, optional) – Set to True to disable messages printed in log info
- Raises:
ValueError – datasets should be among the datasets list provided by get_dataset_list()
- Returns:
DataFrame – contains dimension columns, series keys, dataset name
Notes
Some metadata is stored locally on the computer for 3 months. It is updated automatically.
Examples
>>> from pynsee.macrodata import get_dataset_list, get_series_list
>>> dataset_list = get_dataset_list()
>>> idbank_ipc = get_series_list('IPC-2015', 'CLIMAT-AFFAIRES')
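The note above says that locally stored metadata is kept for about three months and refreshed automatically, while update=True forces a refresh. As a rough illustration of that kind of freshness check, and not pynsee's actual caching code, the sketch below decides whether a cached file should be downloaded again. The helper name `needs_refresh`, the 90-day threshold and the file name are assumptions made for this example.

```python
import os
import tempfile
import time

# Assumption: "3 months" is approximated here as 90 days.
MAX_AGE_SECONDS = 90 * 24 * 60 * 60


def needs_refresh(path, update=False, now=None):
    """Return True if the cached file at `path` should be downloaded again.

    Hypothetical helper: it mirrors the documented behaviour (automatic
    refresh of old metadata, manual refresh with update=True), not the
    internals of pynsee.
    """
    if update or not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    return now - os.path.getmtime(path) > MAX_AGE_SECONDS


# Demonstration with a throwaway file standing in for cached metadata.
with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "metadata.csv")
    missing = needs_refresh(cache_file)                # no file yet
    with open(cache_file, "w") as handle:
        handle.write("placeholder")
    fresh = needs_refresh(cache_file)                  # just written
    forced = needs_refresh(cache_file, update=True)    # manual update
    # Pretend the check runs slightly more than 90 days from now.
    stale = needs_refresh(cache_file, now=time.time() + MAX_AGE_SECONDS + 1)
```

In this sketch only a freshly written file that is not forced to update is served from the local copy; every other case triggers a download.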
- pynsee.macrodata.search_macrodata(pattern='.*', metadata=True)
Search for a pattern among INSEE series (idbanks) from the BDM macroeconomic database
Notes
This function uses the package’s internal data, which might not be the most up-to-date.
- Parameters:
pattern (str, optional) – String used to filter the idbank list. Defaults to “.*”, which returns all series.
Examples
>>> from pynsee.macrodata import search_macrodata
>>> search_all = search_macrodata()
>>> search_paper = search_macrodata("pâte à papier")
>>> search_paris = search_macrodata("PARIS")
>>> search_survey_gdp = search_macrodata("Survey|GDP")
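The default value ".*" and the "Survey|GDP" example suggest that pattern is interpreted as a regular expression. Assuming that is the case, the sketch below shows how such patterns select entries from a list of titles using Python's `re` module. The titles are made up for illustration, `filter_titles` is a hypothetical helper, and the matching here is case-sensitive; how search_macrodata itself handles case is not stated in this section.

```python
import re

# Made-up titles standing in for series titles; not real INSEE data.
titles = [
    "Business Survey - Manufacturing",
    "GDP - Quarterly growth",
    "Consumer price index - Paris",
]


def filter_titles(pattern, items):
    """Keep the items in which the regular expression `pattern` is found."""
    regex = re.compile(pattern)
    return [item for item in items if regex.search(item)]


everything = filter_titles(".*", titles)             # ".*" matches any title
survey_or_gdp = filter_titles("Survey|GDP", titles)  # "|" means "either ... or"
paris = filter_titles("Paris", titles)               # plain substring match
```

The alternation pattern keeps the first two titles, which is the behaviour one would expect from the "Survey|GDP" example if the regular-expression assumption holds.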
- pynsee.macrodata.get_last_release()
Get the datasets from the BDM macroeconomic database released in the last 30 days
- Examples
>>> from pynsee.macrodata import get_last_release
>>> dataset_released = get_last_release()
- pynsee.macrodata.get_column_title(dataset=None, update=True)
Get the title of a dataset’s columns
- Parameters:
dataset (str, optional) – An INSEE dataset name. Defaults to None, this returns all columns.
- Raises:
ValueError – dataset must be a single string (length one)
ValueError – Dataset must belong to INSEE datasets list
Examples
>>> from pynsee.macrodata import get_column_title
>>> insee_all_columns = get_column_title()
>>> balance_paiements_columns = get_column_title("BALANCE-PAIEMENTS")
Search geographical data
- pynsee.geodata.get_geodata_list(update=False, silent=False) → DataFrame
Get a list of geographical limits of French administrative areas from IGN API
- Parameters:
update (bool, optional) – Trigger an update, otherwise locally saved data is used. Defaults to False.
silent (bool, optional) – Set to True to disable messages printed in log info
Examples
>>> from pynsee.geodata import get_geodata_list
>>> # Get a list of geographical limits of French administrative areas from IGN API
>>> geodata_list = get_geodata_list()
Search local data
- pynsee.localdata.get_local_metadata()
Get a list of all combinations of datasets, variables and unit measures available from INSEE Local API
Notes
This function only returns the package’s internal data, which might not be the most up-to-date.
Examples
>>> from pynsee.localdata import get_local_metadata
>>> metadata = get_local_metadata()
- pynsee.localdata.get_nivgeo_list()
Get a list of geographic levels
- Examples
>>> from pynsee.localdata import get_nivgeo_list
>>> nivgeo_list = get_nivgeo_list()
- pynsee.localdata.get_geo_list(geo=None, date=None, update=False, silent=False)
Get a list of French geographic areas (communes, departements, regions …)
- Parameters:
geo (str) – choose among : communes, communesDeleguees, communesAssociees, regions, departements, arrondissements, arrondissementsMunicipaux
date (str) – date of validity (YYYY-MM-DD)
update (bool) – locally saved data is used by default. Trigger an update with update=True.
silent (bool, optional) – Set to True to disable messages printed in log info
- Raises:
ValueError – geo should be among the geographic area list
Examples
>>> from pynsee.localdata import get_geo_list
>>> city_list = get_geo_list('communes')
>>> region_list = get_geo_list('regions')
>>> departement_list = get_geo_list('departements')
>>> arrondiss_list = get_geo_list('arrondissements')
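The date parameter of get_geo_list is documented as a string in year-month-day form. A minimal sketch of how calling code might build and check such a string before passing it on is shown below. Both helpers, `to_validity_date` and `is_validity_date`, are hypothetical and are not part of pynsee; whether pynsee performs a similar check internally is not stated here.

```python
from datetime import date, datetime


def to_validity_date(day):
    """Format a datetime.date as a year-month-day string, e.g. '2023-01-01'."""
    return day.isoformat()


def is_validity_date(text):
    """Return True if `text` is a real calendar date written as YYYY-MM-DD."""
    try:
        parsed = datetime.strptime(text, "%Y-%m-%d")
    except ValueError:
        return False
    # strptime tolerates missing zero padding ("2023-1-1"); reject it here.
    return parsed.strftime("%Y-%m-%d") == text


new_year = to_validity_date(date(2023, 1, 1))
good = is_validity_date(new_year)
wrong_order = is_validity_date("01/01/2023")
impossible = is_validity_date("2023-02-30")
```

A string that passes this check could then be supplied as the date argument, for example get_geo_list('communes', date=new_year).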
- pynsee.localdata.get_area_list(area=None, date=None, update=False, silent=False) → DataFrame
Get an exhaustive list of administrative areas : communes, departments, and urban, employment or functional areas
- Parameters:
area (str, optional) – Defaults to None, then get all values
date (str) – date of validity (YYYY-MM-DD)
update (bool) – locally saved data is used by default. Trigger an update with update=True.
silent (bool, optional) – Set to True to disable messages printed in log info
- Raises:
ValueError – Error if area is not available
Examples
>>> from pynsee.localdata import get_area_list
>>> area_list = get_area_list()
>>> #
>>> # get list of all regions in France
>>> reg = get_area_list(area='regions')
Search metadata
- pynsee.metadata.get_definition_list()
Get a list of concept definitions
Examples
>>> from pynsee.metadata import get_definition_list
>>> definition = get_definition_list()
- pynsee.metadata.get_activity_list(level)
Get a list of economic activities from NAF/NACE rev 2 2008 classification
Notes
This function uses NAF/NACE rev. 2 classification made in 2008. This function renders only package’s internal data.
- Parameters:
level (str) – Levels available are : A5, A10, A17, A21, A38, A64, A88, A129, A138, NAF1, NAF2, NAF3, NAF4, NAF5
- Raises:
ValueError – an error is raised if level is not in the default list
Examples
>>> from pynsee.metadata import get_activity_list
>>> activity_A138 = get_activity_list('A138')
>>> activity_NAF3 = get_activity_list('NAF3')
>>> activity_NAF5 = get_activity_list('NAF5')
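get_activity_list is documented as raising a ValueError when level is not one of the listed values. The sketch below reproduces that contract with a stand-alone check so that calling code can validate user input before making the call. The function `check_level` and its error message are assumptions for illustration; only the list of levels comes from the documentation above.

```python
# Levels copied from the parameter description of get_activity_list.
LEVELS = (
    "A5", "A10", "A17", "A21", "A38", "A64", "A88", "A129", "A138",
    "NAF1", "NAF2", "NAF3", "NAF4", "NAF5",
)


def check_level(level):
    """Return `level` unchanged if it is a documented level, else raise."""
    if level not in LEVELS:
        raise ValueError(
            f"level {level!r} is not available, choose among: {', '.join(LEVELS)}"
        )
    return level


valid = check_level("NAF5")

# An unknown level is reported with the list of accepted values.
try:
    check_level("A999")
except ValueError as err:
    message = str(err)
```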
Search sirene data
- pynsee.sirene.get_dimension_list(kind='siret')
Get a list of all columns useful to make queries with search_sirene
- Parameters:
kind (str, optional) – Choose between siret and siren. Defaults to ‘siret’.
Examples
>>> from pynsee.sirene import get_dimension_list
>>> sirene_dimension = get_dimension_list()
- pynsee.sirene.search_sirene(variable, pattern, kind='siret', phonetic_search=False, and_condition=True, upper_case=False, decode=False, number=1000, activity=True, legal=False, closed=False, update=False, silent=False)
Get data about companies from criteria on variables
- Parameters:
variable (str or list) – name of the variable on which the search is applied.
pattern (str or list) – the pattern or criterion searched
kind (str, optional) – kind of companies : siren or siret. Defaults to “siret”
phonetic_search (bool, or list of bool, optional) – If True, phonetic search is triggered on all variables of the list. If it is a list of True/False values, phonetic search is applied accordingly to each variable of the list
and_condition (bool, optional) – If True, only records meeting all conditions are kept (AND is inserted between the conditions). If False, all records meeting at least one condition are kept (OR is inserted between the conditions).
number (int, optional) – Number of companies searched. Defaults to 1000. If it is above 1000, multiple queries are triggered.
upper_case (bool, optional) – If True, values of argument ‘pattern’ are converted to upper case and added to the list of searched patterns.
decode (bool, optional) – If True, values of argument ‘pattern’ are decoded (in particular, accents are removed) and the decoded values are added to the list of searched patterns.
activity (bool, optional) – If True, activity title is added based on NAF/NACE. Defaults to True.
legal (bool, optional) – If True, legal entity titles are added
closed (bool, optional) – If False, closed entities are removed from the data and for each legal entity only the last period for which the data is stable is displayed
silent (bool, optional) – Set to True to disable messages printed in log info
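The upper_case, decode and phonetic_search parameters all describe how the inputs are expanded before the search. The sketch below illustrates one plausible reading of those descriptions: extra pattern variants are added to the original pattern, and a single boolean is applied to every variable. The helpers `strip_accents`, `expand_patterns` and `broadcast_phonetic` are hypothetical and are not pynsee's implementation. In particular, whether pynsee also combines both options (upper case without accents) is not stated above and is left out here.

```python
import unicodedata


def strip_accents(text):
    """Remove accents by dropping combining marks after Unicode decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_patterns(pattern, upper_case=False, decode=False):
    """Return the pattern plus its upper-case and accent-free variants."""
    variants = [pattern]
    if upper_case:
        variants.append(pattern.upper())
    if decode:
        variants.append(strip_accents(pattern))
    # dict.fromkeys drops duplicates while keeping the original order.
    return list(dict.fromkeys(variants))


def broadcast_phonetic(phonetic_search, variables):
    """Turn a single bool into one flag per variable; check list lengths."""
    if isinstance(phonetic_search, bool):
        return [phonetic_search] * len(variables)
    if len(phonetic_search) != len(variables):
        raise ValueError("phonetic_search must have one value per variable")
    return list(phonetic_search)


variables = ["denominationUniteLegale", "categorieEntreprise"]
name_variants = expand_patterns("Dassot Système", upper_case=True, decode=True)
size_variants = expand_patterns("GE", upper_case=True, decode=True)
flags_all = broadcast_phonetic(True, variables)
flags_each = broadcast_phonetic([True, False], variables)
```

Under these assumptions a pattern that is already upper case and accent-free, such as 'GE', yields a single variant, so no redundant search terms are produced.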
Notes
This function may return personal data. Please check and comply with the legal framework relating to personal data protection.
Examples
>>> from pynsee.metadata import get_activity_list
>>> from pynsee.sirene import search_sirene
>>> from pynsee.sirene import get_dimension_list
>>> #
>>> # Get available column names, it is useful to design your query with search_sirene
>>> sirene_dimension = get_dimension_list()
>>> #
>>> # Get activity list (NAF rev 2)
>>> naf5 = get_activity_list('NAF5')
>>> #
>>> # Get a list of hospitals in Paris
>>> df = search_sirene(variable = ["activitePrincipaleUniteLegale",
...                                "codePostalEtablissement"],
...                    pattern = ["86.10Z", "75*"], kind = "siret")
>>> #
>>> # Get a list of companies located in Igny city whose name matches with 'pizza' using a phonetic search
>>> df = search_sirene(variable = ["libelleCommuneEtablissement",
...                                'denominationUniteLegale'],
...                    pattern = ["igny", 'pizza'],
...                    phonetic_search=True, kind = "siret")
>>> #
>>> # Get a list of companies whose name matches with 'SNCF' (French national railway company)
>>> # and whose legal status is SAS (societe par actions simplifiee)
>>> df = search_sirene(variable=["denominationUniteLegale",
...                              'categorieJuridiqueUniteLegale'],
...                    pattern=["sncf", '5710'], kind="siren")
>>> #
>>> # Get data on Hadrien Leclerc
>>> df = search_sirene(variable = ['prenom1UniteLegale', 'nomUniteLegale'],
...                    pattern = ['hadrien', 'leclerc'],
...                    phonetic_search = [True, False],
...                    closed=True)
>>> #
>>> # Find 2500 tobacco shops
>>> df = search_sirene(variable = ['denominationUniteLegale'],
...                    pattern = ['tabac'],
...                    number = 2500,
...                    kind = "siret")
>>> #
>>> # Find 1000 companies whose name sounds like Dassault Système or is a big company (GE),
>>> # search is made as well on patterns whose accents have been removed
>>> import os
>>> # environment variable 'pynsee_print_url' forces the package to print the request
>>> os.environ["pynsee_print_url"] = 'True'
>>> df = search_sirene(variable = ["denominationUniteLegale", 'categorieEntreprise'],
...                    pattern = ['Dassot Système', 'GE'],
...                    and_condition = False,
...                    upper_case = True,
...                    decode = True,
...                    update = True,
...                    phonetic_search = [True, False],
...                    number = 1000)
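The number parameter is documented as triggering multiple queries when it exceeds 1000, as in the request for 2500 tobacco shops. The sketch below shows how such a request could be split into successive queries. It assumes a fixed limit of 1000 results per query, inferred from that description; the helper `page_sizes` is hypothetical and does not reflect how pynsee actually schedules its requests.

```python
def page_sizes(number, page_limit=1000):
    """Split a requested number of results into per-query sizes."""
    if number < 1:
        raise ValueError("number must be a positive integer")
    full_pages, remainder = divmod(number, page_limit)
    sizes = [page_limit] * full_pages
    if remainder:
        sizes.append(remainder)
    return sizes


tobacco_shops = page_sizes(2500)   # three queries under the assumed limit
single_query = page_sizes(1000)    # exactly one full query
small_query = page_sizes(1)        # one short query
```

With these assumptions, asking for 2500 results corresponds to two full queries and a final query for the remaining 500.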
Search files on insee.fr
- pynsee.download.get_file_list()
Download a list of files available on insee.fr
- Returns:
Returns the requested dataframe as a pandas object
Notes
The metadata of pynsee.download relies on volunteer contributors and their manual updates. get_file_list does not provide data from Insee’s official metadata API. Consequently, please report any issue.
Examples
>>> from pynsee.download import get_file_list
>>> insee_file_list = get_file_list()