If you're new to graph databases, you can think of "Classes" and "Properties" as roughly analogous to table names and table field lists, respectively.
If you first need to clear out your test database, one of the cells below (currently commented out) lets you do so.
import os
import sys
import getpass
import pandas as pd
from neoaccess import NeoAccess
from brainannex import NeoSchema
# In case of problems, try sys.path.append(directory), where directory is your project's root directory
NeoAccess library
NOTE: This tutorial was tested on version 4.4 of the Neo4j database; it will probably also work on version 5, but that is NOT guaranteed.
# Save your credentials here - or use the prompts given by the next cell
host = "" # EXAMPLES: bolt://123.456.789.012 OR neo4j://localhost
# (CAUTION: do NOT include the port number!)
password = ""
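The comment above mentions prompting for credentials when they are left blank. A minimal sketch of how such prompts might look is given here; the helper name `fill_in_credentials` and its `ask`/`ask_secret` parameters are hypothetical, introduced only for illustration, and are not part of NeoAccess.

```python
import getpass

def fill_in_credentials(host, password, ask=input, ask_secret=getpass.getpass):
    """Return (host, password), prompting only for the values left blank.

    Hypothetical helper: `ask` and `ask_secret` are injectable so that the
    function can be exercised without a live terminal.
    """
    if not host:
        host = ask("Enter the host (e.g. neo4j://localhost, without the port number): ")
    if not password:
        # getpass hides the typed characters
        password = ask_secret("Enter the database password: ")
    return host.strip(), password

# In the notebook, one would then run:
# host, password = fill_in_credentials(host, password)
```

Values already saved in `host` and `password` are returned unchanged, so the prompts appear only when something is missing.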
db = NeoAccess(host=host,
credentials=("neo4j", password), debug=False) # Notice the debug option being OFF
Connection to Neo4j database established.
print("Version of the Neo4j driver: ", db.version())
Version of the Neo4j driver: 4.4.11
NeoSchema library
# CLEAR OUT THE DATABASE
#db.empty_dbase() # UNCOMMENT IF DESIRED ***************** WARNING: USE WITH CAUTION!!! ************************
NeoSchema.set_database(db)
# Create a "City" Class node - together with its Properties, based on the data to import
NeoSchema.create_class_with_properties(name="City", properties=["City ID", "name"])
(43255, 'schema-1')
# Likewise for a "State" Class node - together with its Properties, based on the data to import
NeoSchema.create_class_with_properties(name="State", properties=["State ID", "name", "2-letter abbr"])
(43258, 'schema-4')
# Now add a relationship named "IS_IN", from the "City" Class to the "State" Class
NeoSchema.create_class_relationship(from_class="City", to_class="State", rel_name="IS_IN")
We'll pass our data as Pandas data frames; those could easily be read in from CSV files, for example.
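As a rough illustration of the CSV route, the sketch below reads the city data with `pd.read_csv()`. An in-memory `io.StringIO` stands in for a file on disk, and the variable names `city_csv` and `city_from_csv_df` are assumptions made for this example.

```python
import io
import pandas as pd

# Stand-in for a CSV file on disk; the contents mirror the city data used in this tutorial
city_csv = io.StringIO(
    "City ID,name\n"
    "1,Berkeley\n"
    "2,Chicago\n"
    "3,San Francisco\n"
    "4,New York City\n"
)

# pd.read_csv() accepts a file path or any file-like object
city_from_csv_df = pd.read_csv(city_csv)
print(city_from_csv_df)
```

With a real file, the `io.StringIO` object would simply be replaced by the file's path.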
city_df = pd.DataFrame({"City ID": [1, 2, 3, 4], "name": ["Berkeley", "Chicago", "San Francisco", "New York City"]})
city_df
| | City ID | name |
|---|---|---|
| 0 | 1 | Berkeley |
| 1 | 2 | Chicago |
| 2 | 3 | San Francisco |
| 3 | 4 | New York City |
state_df = pd.DataFrame({"State ID": [1, 2, 3], "name": ["California", "Illinois", "New York"], "2-letter abbr": ["CA", "IL", "NY"]})
state_df
| | State ID | name | 2-letter abbr |
|---|---|---|---|
| 0 | 1 | California | CA |
| 1 | 2 | Illinois | IL |
| 2 | 3 | New York | NY |
# In this example, we assume a separate table ("join table") with the data about the relationships;
# this would always be the case for many-to-many relationships;
# 1-to-many relationships, like we have here, could also be stored differently
state_city_links_df = pd.DataFrame({"State ID": [1, 1, 2, 3], "City ID": [1, 3, 2, 4]})
state_city_links_df
| | State ID | City ID |
|---|---|---|
| 0 | 1 | 1 |
| 1 | 1 | 3 |
| 2 | 2 | 2 |
| 3 | 3 | 4 |
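The comments above note that a 1-to-many relationship could also be stored differently. One common alternative, assumed here purely for illustration, is a foreign-key column: each city row carries the ID of its state. The sketch below shows how such a layout could be turned into the same kind of join table; `city_with_state_df` and `derived_links_df` are hypothetical names.

```python
import pandas as pd

# Hypothetical alternative layout: each city row carries the ID of its state
city_with_state_df = pd.DataFrame({
    "City ID": [1, 2, 3, 4],
    "name": ["Berkeley", "Chicago", "San Francisco", "New York City"],
    "State ID": [1, 2, 1, 3],
})

# Keep only the two ID columns to obtain a join table like the one above
derived_links_df = city_with_state_df[["State ID", "City ID"]].reset_index(drop=True)
print(derived_links_df)
```

The rows come out in city order rather than state order, but they describe the same four links.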
NeoSchema.import_pandas_nodes(df=city_df, class_node="City")
import_pandas_nodes(): getting ready to import 4 records...
FINISHED importing a total of 4 records
[43262, 43263, 43264, 43265]
NeoSchema.import_pandas_nodes(df=state_df, class_node="State")
import_pandas_nodes(): getting ready to import 3 records...
FINISHED importing a total of 3 records
[43266, 43267, 43268]
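Before importing links, it may be worth checking that every ID in the join table refers to a row in the node tables. How `import_pandas_links` treats unmatched IDs is not covered in this tutorial, so the check below is done in pandas alone; the variable `dangling_df` is an assumption of this sketch, and the data frames are recreated so that the block runs on its own.

```python
import pandas as pd

# Same data as in the tutorial, recreated so this block is self-contained
city_df = pd.DataFrame({"City ID": [1, 2, 3, 4],
                        "name": ["Berkeley", "Chicago", "San Francisco", "New York City"]})
state_df = pd.DataFrame({"State ID": [1, 2, 3],
                         "name": ["California", "Illinois", "New York"],
                         "2-letter abbr": ["CA", "IL", "NY"]})
state_city_links_df = pd.DataFrame({"State ID": [1, 1, 2, 3], "City ID": [1, 3, 2, 4]})

# Flag join-table rows whose IDs are missing from the corresponding node table
bad_city = ~state_city_links_df["City ID"].isin(city_df["City ID"])
bad_state = ~state_city_links_df["State ID"].isin(state_df["State ID"])
dangling_df = state_city_links_df[bad_city | bad_state]

print("Dangling link rows:", len(dangling_df))
```

An empty `dangling_df` means every link has both of its endpoints among the imported records.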
NeoSchema.import_pandas_links(df=state_city_links_df,
col_from="City ID", col_to="State ID",
link_name="IS_IN")
Getting ready to import 4 links...
FINISHED importing a total of 4 links
[36539, 36540, 36541, 36542]
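As a cross-check that does not touch the database, the same three data frames can be joined in pandas to list the city-to-state pairs that the import is expected to produce. This is only a sketch; `expected_df` is a hypothetical name, and the data frames are recreated so that the block runs on its own.

```python
import pandas as pd

# Same data as in the tutorial, recreated so this block is self-contained
city_df = pd.DataFrame({"City ID": [1, 2, 3, 4],
                        "name": ["Berkeley", "Chicago", "San Francisco", "New York City"]})
state_df = pd.DataFrame({"State ID": [1, 2, 3],
                         "name": ["California", "Illinois", "New York"],
                         "2-letter abbr": ["CA", "IL", "NY"]})
state_city_links_df = pd.DataFrame({"State ID": [1, 1, 2, 3], "City ID": [1, 3, 2, 4]})

# Join links to cities, then to states; both node tables have a "name" column,
# so the second merge disambiguates them with suffixes
expected_df = (
    state_city_links_df
    .merge(city_df, on="City ID")
    .merge(state_df, on="State ID", suffixes=("_city", "_state"))
)
print(expected_df[["name_city", "name_state", "2-letter abbr"]])
```

Each row of `expected_df` corresponds to one "IS_IN" link from a city to its state.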
This is what we have created with our import:
