Search data

Search macroeconomic data

pynsee.macrodata.get_dataset_list()

Download the full list of INSEE datasets from the BDM macroeconomic database

Returns:

DataFrame: a dataframe containing the list of datasets available

Examples:
>>> from pynsee.macrodata import get_dataset_list
>>> insee_dataset = get_dataset_list()

pynsee.macrodata.get_series_list(*datasets, update=False)

Download the list of INSEE series keys for one or several datasets from the BDM macroeconomic database

Args:

datasets (str): one or several dataset names, each taken from the list provided by get_dataset_list()

update (bool, optional): Set to True to manually update the metadata stored locally on the computer. Defaults to False.

Raises:

ValueError: datasets should be among the datasets list provided by get_dataset_list()

Returns:

DataFrame: contains dimension columns, series keys, dataset name

Notes:

Some metadata is stored locally on the computer for 3 months. It is updated automatically.

Examples:
>>> from pynsee.macrodata import get_dataset_list, get_series_list
>>> dataset_list = get_dataset_list()
>>> idbank_ipc = get_series_list('IPC-2015', 'CLIMAT-AFFAIRES')
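
The caching behaviour described in the Notes can be pictured with a small standard-library sketch. It is not pynsee's implementation: the function name needs_refresh, the cache file name and the 90-day approximation of "3 months" are all assumptions made for illustration.

```python
import os
import tempfile
import time

# Assumption: "3 months" is approximated as 90 days; the real threshold may differ.
MAX_AGE_SECONDS = 90 * 24 * 3600


def needs_refresh(path, update=False, now=None):
    """Return True when a locally cached file should be downloaded again."""
    # A forced update or a missing file always triggers a download.
    if update or not os.path.exists(path):
        return True
    now = time.time() if now is None else now
    # Otherwise refresh only when the file is older than the maximum age.
    return (now - os.path.getmtime(path)) > MAX_AGE_SECONDS


with tempfile.TemporaryDirectory() as tmp:
    cache_file = os.path.join(tmp, "series_list.csv")  # hypothetical file name
    missing = needs_refresh(cache_file)                 # no file yet
    with open(cache_file, "w") as handle:
        handle.write("placeholder")
    fresh = needs_refresh(cache_file)                   # file just written
    forced = needs_refresh(cache_file, update=True)     # mirrors update=True
    stale = needs_refresh(cache_file, now=time.time() + 100 * 24 * 3600)
```

Under these assumptions, update=True plays the same role as the forced case: the age of the local copy is ignored and the metadata is fetched again.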

pynsee.macrodata.search_macrodata(pattern='.*', metadata=True)

Search for a pattern among INSEE series (idbanks) from the BDM macroeconomic database

Notes:

This function uses the package’s internal data, which might not be the most up-to-date.

Args:

pattern (str, optional): String used to filter the idbank list. Defaults to “.*”, which returns all series.

Examples:
>>> from pynsee.macrodata import search_macrodata
>>> search_all = search_macrodata()
>>> search_paper = search_macrodata("pâte à papier")
>>> search_paris = search_macrodata("PARIS")
>>> search_survey_gdp = search_macrodata("Survey|GDP")
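
The default ".*" and the "Survey|GDP" example suggest that the pattern is treated as a regular expression. The sketch below shows that kind of filtering on a few invented titles with the standard re module. The titles, the helper name filter_titles and the case-insensitive matching are assumptions, not the package's actual code.

```python
import re

# Hypothetical series titles; the real ones come from the package's internal data.
TITLES = [
    "Business survey in industry",
    "GDP - quarterly growth",
    "Consumer price index - Paris",
    "Production de pâte à papier",
]


def filter_titles(pattern=".*", rows=TITLES):
    """Keep the titles in which the regular expression is found."""
    # Assumption: the search ignores case.
    regex = re.compile(pattern, flags=re.IGNORECASE)
    return [title for title in rows if regex.search(title)]


all_rows = filter_titles()                 # ".*" matches every title
survey_or_gdp = filter_titles("Survey|GDP")  # "|" means "either side"
paris = filter_titles("PARIS")
```

In this sketch the "|" character acts as an alternation, so a single call can cover several keywords.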

pynsee.macrodata.get_last_release()

Get the datasets from the BDM macroeconomic database released in the last 30 days

Examples:
>>> from pynsee.macrodata import get_last_release
>>> dataset_released = get_last_release()

pynsee.macrodata.get_column_title(dataset=None)

Get the title of a dataset’s columns

Args:

dataset (str, optional): An INSEE dataset name. Defaults to None, which returns all columns.

Raises:

ValueError: Only one string (length one) is accepted

ValueError: Dataset must belong to the INSEE datasets list

Examples:
>>> from pynsee.macrodata import get_column_title
>>> insee_all_columns = get_column_title()
>>> balance_paiements_columns = get_column_title("BALANCE-PAIEMENTS")

Search geographical data

pynsee.geodata.get_geodata_list(update=False)

Get a list of geographical limits of French administrative areas from the IGN API

Args:

update (bool, optional): Trigger an update, otherwise locally saved data is used. Defaults to False.

Examples:
>>> from pynsee.geodata import get_geodata_list
>>> # Get a list of geographical limits of French administrative areas from IGN API
>>> geodata_list = get_geodata_list()

Search local data

pynsee.localdata.get_local_metadata()

Get a list of all combinations of datasets, variables and unit measures available from the INSEE Local API

Notes:

This function only renders the package’s internal data, which might not be the most up-to-date.

Examples:
>>> from pynsee.localdata import get_local_metadata
>>> metadata = get_local_metadata()

pynsee.localdata.get_nivgeo_list()

Get a list of geographic levels

Examples:
>>> from pynsee.localdata import get_nivgeo_list
>>> nivgeo_list = get_nivgeo_list()

pynsee.localdata.get_geo_list(geo=None, date=None, update=False)

Get a list of French geographic areas (communes, departements, regions …)

Args:

geo (str): choose among: communes, communesDeleguees, communesAssociees, regions, departements, arrondissements, arrondissementsMunicipaux

date (str): date of validity (YYYY-MM-DD)

update (bool): locally saved data is used by default. Trigger an update with update=True.

Raises:

ValueError: geo should be among the geographic area list

Examples:
>>> from pynsee.localdata import get_geo_list
>>> city_list = get_geo_list('communes')
>>> region_list = get_geo_list('regions')
>>> departement_list = get_geo_list('departements')
>>> arrondiss_list = get_geo_list('arrondissements')
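
The date argument is a year-month-day string. One way to build it without typing it by hand is the standard datetime module, as in this sketch. The call to get_geo_list is commented out because it needs pynsee and network access, and the chosen date is an arbitrary example.

```python
from datetime import date

# date.isoformat() produces the year-month-day layout expected by the 'date' argument.
validity_date = date(2020, 1, 1).isoformat()
print(validity_date)

# Illustrative call, using the signature documented above:
# from pynsee.localdata import get_geo_list
# communes_2020 = get_geo_list('communes', date=validity_date)
```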

pynsee.localdata.get_area_list(area=None, date=None, update=False)

Get an exhaustive list of administrative areas: communes, departments, and urban, employment or functional areas

Args:

area (str, optional): Defaults to None, which returns all areas.

date (str): date of validity (YYYY-MM-DD)

update (bool): locally saved data is used by default. Trigger an update with update=True.

Raises:

ValueError: Error if area is not available

Examples:
>>> from pynsee.localdata import get_area_list
>>> area_list = get_area_list()
>>> #
>>> # get list of all communes in France
>>> com = get_area_list(area='communes')

Search metadata

pynsee.metadata.get_definition_list()

Get a list of concept definitions

Examples:
>>> from pynsee.metadata import get_definition_list
>>> definition = get_definition_list()

pynsee.metadata.get_activity_list(level, version='NAFRev2')

Get a list of economic activities from the NAF/NACE Rev. 2 (2008) classification

Notes:

This function uses the NAF/NACE Rev. 2 classification, made in 2008. It only renders the package’s internal data.

Args:

level (str): Available levels are: A5, A10, A17, A21, A38, A64, A88, A129, A138, NAF1, NAF2, NAF3, NAF4, NAF5

version (str, optional): Defaults to ‘NAFRev2’.

Raises:

ValueError: an error is raised if level is not in the default list

Examples:
>>> from pynsee.metadata import get_activity_list
>>> activity_A138 = get_activity_list('A138')
>>> activity_NAF3 = get_activity_list('NAF3')
>>> activity_NAF5 = get_activity_list('NAF5')
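
The ValueError described under Raises can be anticipated by checking the level before calling the function. The helper below is a hypothetical sketch, not part of pynsee: it only compares a value with the levels listed in the argument description.

```python
# Levels copied from the description of the 'level' argument.
AVAILABLE_LEVELS = [
    "A5", "A10", "A17", "A21", "A38", "A64", "A88", "A129", "A138",
    "NAF1", "NAF2", "NAF3", "NAF4", "NAF5",
]


def check_level(level):
    """Return the level unchanged, or raise ValueError if it is not listed."""
    if level not in AVAILABLE_LEVELS:
        raise ValueError(
            "level should be among: " + ", ".join(AVAILABLE_LEVELS)
        )
    return level


checked = check_level("NAF5")

try:
    check_level("NAF6")  # not a documented level
    rejected = False
except ValueError:
    rejected = True
```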

Search sirene data

pynsee.sirene.get_dimension_list(kind='siret')

Get a list of all columns useful to make queries with search_sirene

Args:

kind (str, optional): Choose between siret and siren. Defaults to ‘siret’.

Examples:
>>> from pynsee.sirene import get_dimension_list
>>> sirene_dimension = get_dimension_list()

pynsee.sirene.search_sirene(variable, pattern, kind='siret', phonetic_search=False, and_condition=True, upper_case=False, decode=False, number=1000, activity=True, legal=False, closed=False, update=False)

Get data about companies from criteria on variables

Args:

variable (str or list): name of the variable on which the search is applied.

pattern (str or list): the pattern or criterion searched

kind (str, optional): kind of companies: siren or siret. Defaults to “siret”.

phonetic_search (bool, or list of bool, optional): If True, phonetic search is triggered on all the variables of the list. If it is a list of True/False values, phonetic search is applied accordingly to each variable of the list.

and_condition (bool, optional): If True, only records meeting all conditions are kept (AND is inserted between the conditions). If False, all records meeting at least one condition are kept (OR is inserted between the conditions).

number (int, optional): Number of companies searched. Defaults to 1000. If it is above 1000, multiple queries are triggered.

upper_case (bool, optional): If True, values of argument ‘pattern’ are converted to upper case and added to the list of searched patterns.

decode (bool, optional): If True, values of argument ‘pattern’ are decoded (in particular, accents are removed) and the decoded values are added to the list of searched patterns.

activity (bool, optional): If True, the activity title is added based on NAF/NACE. Defaults to True.

legal (bool, optional): If True, the titles of legal entities are added.

closed (bool, optional): If False, closed entities are removed from the data and for each legal entity only the last period for which the data is stable is displayed.
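
Two of the arguments above can be illustrated with a short sketch: how a number above 1000 could be split into several queries, and how a single phonetic_search flag could be spread over a list of variables. Both helpers are hypothetical and only restate the descriptions. The page size of 1000 is inferred from the text, and the real request logic may differ.

```python
def plan_queries(number=1000, page_size=1000):
    """Split a requested number of records into per-query sizes."""
    if number < 1:
        raise ValueError("number must be a positive integer")
    full_pages, remainder = divmod(number, page_size)
    return [page_size] * full_pages + ([remainder] if remainder else [])


def broadcast_flags(phonetic_search, variables):
    """Return one True/False flag per variable."""
    # A single boolean applies to every variable.
    if isinstance(phonetic_search, bool):
        return [phonetic_search] * len(variables)
    # A list is used as given, provided it has one flag per variable.
    flags = list(phonetic_search)
    if len(flags) != len(variables):
        raise ValueError("one flag per variable is expected")
    return flags


sizes = plan_queries(2500)
single_page = plan_queries(1000)
flags_all = broadcast_flags(
    True, ["libelleCommuneEtablissement", "denominationUniteLegale"]
)
flags_mixed = broadcast_flags(
    [True, False], ["prenom1UniteLegale", "nomUniteLegale"]
)
```

Under these assumptions, asking for 2500 records corresponds to three queries, the last one being smaller than the others.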

Notes:

This function may return personal data. Please check and comply with the legal framework relating to personal data protection.

Examples:
>>> from pynsee.metadata import get_activity_list
>>> from pynsee.sirene import search_sirene
>>> from pynsee.sirene import get_dimension_list
>>> #
>>> # Get available column names, it is useful to design your query with search_sirene
>>> sirene_dimension = get_dimension_list()
>>> #
>>> # Get activity list (NAF rev 2)
>>> naf5 = get_activity_list('NAF5')
>>> #
>>> # Get a list of hospitals in Paris
>>> df = search_sirene(variable = ["activitePrincipaleUniteLegale",
...                                "codePostalEtablissement"],
...                    pattern = ["86.10Z", "75*"], kind = "siret")
>>> #
>>> # Get a list of companies located in the city of Igny whose name matches 'pizza', using a phonetic search
>>> df = search_sirene(variable = ["libelleCommuneEtablissement",
...                                'denominationUniteLegale'],
...                    pattern = ["igny", 'pizza'],
...                    phonetic_search=True, kind = "siret")
>>> #
>>> # Get a list of companies whose name matches 'SNCF' (French national railway company)
>>> # and whose legal status is SAS (societe par actions simplifiee)
>>> df = search_sirene(variable=["denominationUniteLegale",
...                              'categorieJuridiqueUniteLegale'],
...                    pattern=["sncf", '5710'], kind="siren")
>>> #
>>> # Get data on Hadrien Leclerc
>>> df = search_sirene(variable = ['prenom1UniteLegale', 'nomUniteLegale'],
...                    pattern = ['hadrien', 'leclerc'],
...                    phonetic_search = [True, False],
...                    closed=True)
>>> #
>>> # Find 2500 tobacco shops
>>> df = search_sirene(variable = ['denominationUniteLegale'],
...                    pattern = ['tabac'],
...                    number = 2500,
...                    kind = "siret")
>>> #
>>> # Find 1000 companies whose name sounds like Dassault Système or which are big companies (GE);
>>> # the search is also made on patterns whose accents have been removed
>>> import os
>>> # the environment variable 'pynsee_print_url' forces the package to print the request
>>> os.environ["pynsee_print_url"] = 'True'
>>> df = search_sirene(variable = ["denominationUniteLegale", 'categorieEntreprise'],
...                    pattern = ['Dassot Système', 'GE'],
...                    and_condition = False,
...                    upper_case = True,
...                    decode = True,
...                    update = True,
...                    phonetic_search = [True, False],
...                    number = 1000)
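
The last example combines upper_case, decode and and_condition. The sketch below shows one plausible reading of their descriptions, using only the standard library: each pattern is expanded into upper-case and accent-free variants, and conditions are joined with AND or OR. The helper names and the "field:value" condition strings are invented for illustration and do not reproduce the query that pynsee actually sends.

```python
import unicodedata


def strip_accents(text):
    """Remove accents by dropping combining marks after decomposition."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def expand_pattern(pattern, upper_case=False, decode=False):
    """Return the pattern plus its upper-case and accent-free variants."""
    variants = [pattern]
    if upper_case:
        variants.append(pattern.upper())
    if decode:
        variants.extend(strip_accents(v) for v in list(variants))
    # Drop duplicates while keeping the original order.
    return list(dict.fromkeys(variants))


def combine(conditions, and_condition=True):
    """Join conditions with AND (all must hold) or OR (at least one holds)."""
    joiner = " AND " if and_condition else " OR "
    return joiner.join(conditions)


name_variants = expand_pattern("Dassot Système", upper_case=True, decode=True)
size_variants = expand_pattern("GE", upper_case=True, decode=True)
query = combine(["name:dassot", "size:GE"], and_condition=False)
```

With and_condition=False the two conditions are alternatives, which is why the example above can return companies that match either the name or the company category.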

Search files on insee.fr

pynsee.download.get_file_list()

Download a list of files available on insee.fr

Returns:

DataFrame: the requested dataframe as a pandas object

Notes:

pynsee.download’s metadata relies on volunteer contributors and their manual updates. get_file_list does not provide data from INSEE’s official metadata API. Consequently, please report any issue you encounter.

Examples:
>>> from pynsee.download import get_file_list
>>> insee_file_list = get_file_list()