Search data

Search macroeconomic data

pynsee.macrodata.get_dataset_list()

Download the full list of INSEE datasets from the BDM macroeconomic database

Returns:

DataFrame: a dataframe containing the list of datasets available

Examples:
>>> from pynsee.macrodata import get_dataset_list
>>> insee_dataset = get_dataset_list()

pynsee.macrodata.get_series_list(*datasets, update=False)

Download the list of INSEE series keys for one or several datasets from the BDM macroeconomic database

Args:

datasets (str): one or more dataset names; each should be among the datasets listed by get_dataset_list()

update (bool, optional): Set to True to manually update the metadata stored locally on the computer. Defaults to False.

Raises:

ValueError: datasets should be among the datasets list provided by get_dataset_list()

Returns:

DataFrame: contains dimension columns, series keys, dataset name

Notes:

Some metadata is stored locally on the computer for 3 months. It is updated automatically.

Examples:
>>> from pynsee.macrodata import get_dataset_list, get_series_list
>>> dataset_list = get_dataset_list()
>>> idbank_ipc = get_series_list('IPC-2015', 'CLIMAT-AFFAIRES')
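
The note above says that metadata is kept locally for 3 months and that update=True forces a refresh. The following is a minimal sketch of that kind of freshness check, not pynsee's actual implementation: the helper names, the 90-day approximation of "3 months" and the use of the file modification time are all assumptions made for illustration.

```python
import os
import tempfile
import time

# Assumption: "3 months" is approximated as 90 days for this sketch.
MAX_AGE_SECONDS = 90 * 24 * 60 * 60


def is_cache_fresh(path, now=None, max_age=MAX_AGE_SECONDS):
    """Return True if `path` exists and was modified less than `max_age` seconds ago."""
    if not os.path.exists(path):
        return False
    now = time.time() if now is None else now
    return (now - os.path.getmtime(path)) < max_age


def needs_download(path, update=False, now=None):
    """Hypothetical rule: refresh when update=True or when the cached file is stale."""
    return update or not is_cache_fresh(path, now=now)


# A temporary file stands in for the locally stored metadata.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    cache_path = handle.name

fresh_result = needs_download(cache_path)                # file just written
forced_result = needs_download(cache_path, update=True)  # manual update requested
later = time.time() + 100 * 24 * 60 * 60                 # pretend 100 days have passed
stale_result = needs_download(cache_path, now=later)
os.remove(cache_path)
missing_result = needs_download(cache_path)              # no cached file at all
```

Only the forced, stale and missing cases ask for a new download; a recently written file is reused.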

pynsee.macrodata.search_macrodata(pattern='.*', metadata=True)

Search for a pattern among INSEE series (idbanks) from the BDM macroeconomic database

Notes:

This function uses the package’s internal data, which might not be up to date.

Args:

pattern (str, optional): String used to filter the idbank list. Defaults to “.*”, which returns all series.

Examples:
>>> from pynsee.macrodata import search_macrodata
>>> search_all = search_macrodata()
>>> search_paper = search_macrodata("pâte à papier")
>>> search_paris = search_macrodata("PARIS")
>>> search_survey_gdp = search_macrodata("Survey|GDP")
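
The default ".*" and the "Survey|GDP" example suggest that the pattern is treated as a regular expression. The sketch below shows how such a pattern would filter a list of titles. The titles, the helper filter_titles and the case-insensitive matching are assumptions made for illustration; they are not taken from pynsee.

```python
import re

# Hypothetical series titles; real ones come from the package's internal data.
titles = [
    "Business survey in manufacturing",
    "GDP at market prices",
    "Pâte à papier - production index",
    "Population of PARIS",
]


def filter_titles(pattern=".*", items=titles):
    """Keep the items in which the regular expression matches somewhere.

    Case-insensitive matching is an assumption made for this sketch.
    """
    regex = re.compile(pattern, flags=re.IGNORECASE)
    return [item for item in items if regex.search(item)]


all_titles = filter_titles()                 # ".*" matches every title
survey_or_gdp = filter_titles("Survey|GDP")  # "|" matches either alternative
paper = filter_titles("pâte à papier")       # plain text is matched literally
```

With a regular expression, "|" combines several search terms in one call, which is why a single pattern can cover both "Survey" and "GDP".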

pynsee.macrodata.get_last_release()

Get the datasets from the BDM macroeconomic database released in the last 30 days

Examples:
>>> from pynsee.macrodata import get_last_release
>>> dataset_released = get_last_release()

pynsee.macrodata.get_column_title(dataset=None)

Get the title of a dataset’s columns

Args:

dataset (str, optional): An INSEE dataset name. Defaults to None, which returns all columns.

Raises:

ValueError: dataset must be a single string (length one)

ValueError: dataset must belong to the INSEE datasets list

Examples:
>>> from pynsee.macrodata import get_column_title
>>> insee_all_columns = get_column_title()
>>> balance_paiements_columns = get_column_title("BALANCE-PAIEMENTS")

Search geographical data

pynsee.geodata.get_geodata_list(update=False)

Get a list of geographical limits of French administrative areas from the IGN API

Args:

update (bool, optional): Trigger an update, otherwise locally saved data is used. Defaults to False.

Examples:
>>> from pynsee.geodata import get_geodata_list
>>> # Get a list of geographical limits of French administrative areas from IGN API
>>> geodata_list = get_geodata_list()

Search local data

pynsee.localdata.get_local_metadata()

Get a list of all combinations of datasets, variables and units of measure available from the INSEE Local API

Notes:

This function renders only the package’s internal data, which might not be up to date.

Examples:
>>> from pynsee.localdata import get_local_metadata
>>> metadata = get_local_metadata()

pynsee.localdata.get_nivgeo_list()

Get a list of geographic levels

Examples:
>>> from pynsee.localdata import get_nivgeo_list
>>> nivgeo_list = get_nivgeo_list()

pynsee.localdata.get_geo_list(geo=None, date=None, update=False)

Get a list of French geographic areas (communes, departements, regions, ...)

Args:

geo (str): one of communes, communesDeleguees, communesAssociees, regions, departements, arrondissements, arrondissementsMunicipaux

date (str): date of validity (YYYY-MM-DD)

update (bool): locally saved data is used by default. Trigger an update with update=True.

Raises:

ValueError: geo should be among the geographic area list

Examples:
>>> from pynsee.localdata import get_geo_list
>>> city_list = get_geo_list('communes')
>>> region_list = get_geo_list('regions')
>>> departement_list = get_geo_list('departements')
>>> arrondiss_list = get_geo_list('arrondissements')
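
The geo and date arguments described above can be checked before any request is made. The helper below is a hypothetical sketch, not part of pynsee: it rejects a geo value outside the documented list with a ValueError and parses the validity date written as YYYY-MM-DD.

```python
from datetime import datetime

# Values copied from the description of the `geo` argument above.
GEO_CHOICES = (
    "communes", "communesDeleguees", "communesAssociees", "regions",
    "departements", "arrondissements", "arrondissementsMunicipaux",
)


def check_geo_args(geo, date_str=None):
    """Validate arguments intended for get_geo_list (hypothetical helper)."""
    if geo not in GEO_CHOICES:
        raise ValueError(f"geo should be among: {', '.join(GEO_CHOICES)}")
    if date_str is None:
        return geo, None
    # strptime raises ValueError for malformed or impossible dates such as "2021-02-30".
    parsed = datetime.strptime(date_str, "%Y-%m-%d").date()
    return geo, parsed


checked = check_geo_args("communes", "2020-01-01")
```

Checking the arguments locally gives a clear error message before any data is requested.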

pynsee.localdata.get_area_list(area=None, date=None, update=False)

Get an exhaustive list of administrative areas: communes, departments, and urban, employment or functional areas

Args:

area (str, optional): Defaults to None, which returns all areas.

date (str): date of validity (YYYY-MM-DD)

update (bool): locally saved data is used by default. Trigger an update with update=True.

Raises:

ValueError: Error if area is not available

Examples:
>>> from pynsee.localdata import get_area_list
>>> area_list = get_area_list()
>>> #
>>> # get list of all communes in France
>>> com = get_area_list(area='communes')

Search metadata

pynsee.metadata.get_definition_list()

Get a list of concept definitions

Examples:
>>> from pynsee.metadata import get_definition_list
>>> definition = get_definition_list()

pynsee.metadata.get_activity_list(level, version='NAFRev2')

Get a list of economic activities from the NAF/NACE Rev. 2 (2008) classification

Notes:

This function uses the NAF/NACE Rev. 2 classification made in 2008. It renders only the package’s internal data.

Args:

level (str): Available levels are: A5, A10, A17, A21, A38, A64, A88, A129, A138, NAF1, NAF2, NAF3, NAF4, NAF5

version (str, optional): Defaults to ‘NAFRev2’.

Raises:

ValueError: an error is raised if level is not in the default list

Examples:
>>> from pynsee.metadata import get_activity_list
>>> activity_A138 = get_activity_list('A138')
>>> activity_NAF3 = get_activity_list('NAF3')
>>> activity_NAF5 = get_activity_list('NAF5')

Search sirene data

pynsee.sirene.get_dimension_list(kind='siret')

Get a list of all columns useful to make queries with search_sirene

Args:

kind (str, optional): Choose between siret and siren. Defaults to ‘siret’.

Examples:
>>> from pynsee.sirene import get_dimension_list
>>> sirene_dimension = get_dimension_list()

pynsee.sirene.search_sirene(variable, pattern, kind='siret', phonetic_search=False, number=1000, activity=True, legal=False, closed=False, update=False)

Get data about companies that match criteria on variables

Args:

variable (str or list): name of the variable on which the search is applied.

pattern (str or list): the pattern or criterion searched

kind (str, optional): kind of companies: siren or siret. Defaults to “siret”

phonetic_search (bool, or list of bool, optional): If True, phonetic search is triggered on all variables of the list. If it is a list of True/False values, phonetic search is applied accordingly to each variable of the list.

number (int, optional): Number of companies searched. Defaults to 1000. If it is above 1000, multiple queries are triggered.

activity (bool, optional): If True, the activity title is added based on NAF/NACE. Defaults to True.

legal (bool, optional): If True, legal entity titles are added

closed (bool, optional): If False, closed entities are removed from the data and for each legal entity only the last period for which the data is stable is displayed.

Notes:

This function may return personal data. Please check and comply with the legal framework relating to personal data protection.

Examples:
>>> from pynsee.metadata import get_activity_list
>>> from pynsee.sirene import search_sirene
>>> from pynsee.sirene import get_dimension_list
>>> #
>>> # Get available column names, it is useful to design your query with search_sirene
>>> sirene_dimension = get_dimension_list()
>>> #
>>> # Get activity list (NAF rev 2)
>>> naf5 = get_activity_list('NAF5')
>>> #
>>> # Get a list of hospitals in Paris
>>> df = search_sirene(variable = ["activitePrincipaleUniteLegale",
...                                "codePostalEtablissement"],
...                    pattern = ["86.10Z", "75*"], kind = "siret")
>>> #
>>> # Get a list of companies located in the city of Igny whose name matches 'pizza', using a phonetic search
>>> df = search_sirene(variable = ["libelleCommuneEtablissement",
...                                'denominationUniteLegale'],
...                    pattern = ["igny", 'pizza'],
...                    phonetic_search=True, kind = "siret")
>>> #
>>> # Get a list of companies whose name matches 'SNCF' (French national railway company)
>>> # and whose legal status is SAS (societe par actions simplifiee)
>>> df = search_sirene(variable=["denominationUniteLegale",
...                              'categorieJuridiqueUniteLegale'],
...                    pattern=["sncf", '5710'], kind="siren")
>>> #
>>> # Get data on Hadrien Leclerc
>>> df = search_sirene(variable = ['prenom1UniteLegale', 'nomUniteLegale'],
...                    pattern = ['hadrien', 'leclerc'],
...                    phonetic_search = [True, False],
...                    closed=True)
>>> #
>>> # Find 2500 tobacco shops
>>> df = search_sirene(variable = ['denominationUniteLegale'],
...                    pattern = ['tabac'],
...                    number = 2500,
...                    kind = "siret")
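
Two of the arguments above follow simple rules: phonetic_search accepts either one bool or one bool per variable, and a number above 1000 triggers several queries. The sketch below illustrates both rules with hypothetical helpers. The function names and the exact way requests are split are assumptions, not pynsee's internal code; only the limit of 1000 comes from the description of number.

```python
def normalize_phonetic(phonetic_search, variables):
    """Expand a single bool to one flag per variable; otherwise check the list length."""
    if isinstance(phonetic_search, bool):
        return [phonetic_search] * len(variables)
    flags = list(phonetic_search)
    if len(flags) != len(variables):
        raise ValueError("phonetic_search needs one flag per variable")
    return flags


def split_into_queries(number, page_size=1000):
    """Split `number` requested results into queries of at most `page_size` each."""
    if number <= 0:
        raise ValueError("number must be positive")
    full_pages, remainder = divmod(number, page_size)
    return [page_size] * full_pages + ([remainder] if remainder else [])


variables = ["prenom1UniteLegale", "nomUniteLegale"]
flags_all = normalize_phonetic(True, variables)             # same flag for every variable
flags_mixed = normalize_phonetic([True, False], variables)  # one flag per variable
pages = split_into_queries(2500)                            # sizes of the successive queries
```

Under these assumptions, the tobacco shop example with number = 2500 would be served by three queries of 1000, 1000 and 500 results.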

Search files on insee.fr

pynsee.download.get_file_list()

Download a list of files available on insee.fr

Returns:

DataFrame: the list of files, returned as a pandas DataFrame

Notes:

The metadata in pynsee.download relies on volunteer contributors and their manual updates. get_file_list does not provide data from an official INSEE metadata API, so please report any issues.

Examples:
>>> from pynsee.download import get_file_list
>>> insee_file_list = get_file_list()