Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PostgreSQL Apache AGE Rails (GraphDB driver questions / problems) #51843

Closed
btihen opened this issue May 15, 2024 · 6 comments
Closed

PostgreSQL Apache AGE Rails (GraphDB driver questions / problems) #51843

btihen opened this issue May 15, 2024 · 6 comments

Comments

@btihen
Copy link

btihen commented May 15, 2024

Steps to reproduce

I am writing a Rails PG Apache AGE adapter (https://age.apache.org/) a PG GraphDB extension. I have the basics working in development mode, however, I am not sure how to get it to work with both development and test.

I am using the database Config:

default: &default
  adapter: postgresql
  encoding: unicode
  username: postgresUser
  password: postgresPW
  host: localhost
  port: 5455
  # For details on connection pooling, see Rails configuration guide
  # https://guides.rubyonrails.org/configuring.html#database-pooling
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>
  schema_search_path: "ag_catalog,age_schema,public"

development:
  <<: *default
  database: graphdb_mini_development

test:
  <<: *default
  database: graphdb_mini_test

production:
  <<: *default
  database: graphdb_mini_production
  username: graphdb_mini
  password: <%= ENV["GRAPHDB_MINI_DATABASE_PASSWORD"] %>

I have a setup migration with:

class AddAgeBaseMigration < ActiveRecord::Migration[7.1]
  def change
    # allow age extension
    execute("CREATE EXTENSION IF NOT EXISTS age;")
    # load the age code
    execute("LOAD 'age';")
    # load the ag_catalog into the search path
    execute('SET search_path = ag_catalog, "$user", public;')

    # creates our AGE schema
    # USE: `execute("SELECT create_graph('age_schema');")`, as we need to use: `create_graph` function
    # NOT: `ActiveRecord::Base.connection.create_schema('age_schema')`
    execute("SELECT create_graph('age_schema');")
  end
end

all goes well when I do:

~/devel/learn/mentor/age/graphdb_mini main ?25                                                                     Ruby 3.3.1 Node 20.12.0 Ruby ruby-3.3.1 19:56:59
wti0405@GREM-VPJQF2D52M ❯ bin/rails db:create         
Created database 'graphdb_mini_development'
Created database 'graphdb_mini_test'

~/devel/learn/mentor/age/graphdb_mini main ?25                                                                     Ruby 3.3.1 Node 20.12.0 Ruby ruby-3.3.1 19:57:07
wti0405@GREM-VPJQF2D52M ❯ bin/rails db:migrate        
== 20240514224709 AddAgeBaseMigration: migrating ==============================
-- execute("CREATE EXTENSION IF NOT EXISTS age;")
   -> 0.0391s
-- execute("LOAD 'age';")
   -> 0.0044s
-- execute("SET search_path = ag_catalog, \"$user\", public;")
   -> 0.0025s
-- execute("SELECT create_graph('age_schema');")
   -> 0.0117s
== 20240514224709 AddAgeBaseMigration: migrated (0.0586s) =====================

unknown OID 4089: failed to recognize type of 'namespace'. It will be treated as String.
unknown OID 2205: failed to recognize type of 'relation'. It will be treated as String.

and I get the schema:

ActiveRecord::Schema[7.1].define(version: 2024_05_05_183043) do
  create_schema "ag_catalog"
  create_schema "age_schema"

  # These are extensions that must be enabled in order to support this database
  enable_extension "age"
  enable_extension "plpgsql"

  # Could not dump table "_ag_label_edge" because of following StandardError
  #   Unknown type 'graphid' for column 'id'
  
  # Could not dump table "_ag_label_vertex" because of following StandardError
  #   Unknown type 'graphid' for column 'id'
  
  # Could not dump table "ag_graph" because of following StandardError
  #   Unknown type 'regnamespace' for column 'namespace'
  
  # Could not dump table "ag_label" because of following StandardError
  #   Unknown type 'regclass' for column 'relation'

  add_foreign_key "ag_label", "ag_graph", column: "graph", primary_key: "graphid", name: "fk_graph_oid"
end

when I look in the database AGE Extension (only it creates the following schemas and tables - in case that helps):

graphdb_age_development=# \dt+ ag_catalog.*
                                           List of relations
   Schema   |   Name   | Type  |    Owner     | Persistence | Access method |    Size    | Description 
------------+----------+-------+--------------+-------------+---------------+------------+-------------
 ag_catalog | ag_graph | table | postgresUser | permanent   | heap          | 8192 bytes | 
 ag_catalog | ag_label | table | postgresUser | permanent   | heap          | 8192 bytes | 
(2 rows)


graphdb_age_development=# \d+ ag_catalog.ag_graph
                                          Table "ag_catalog.ag_graph"
  Column   |     Type     | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
-----------+--------------+-----------+----------+---------+---------+-------------+--------------+-------------
 graphid   | oid          |           | not null |         | plain   |             |              | 
 name      | name         |           | not null |         | plain   |             |              | 
 namespace | regnamespace |           | not null |         | plain   |             |              | 
Indexes:
    "ag_graph_graphid_index" UNIQUE, btree (graphid)
    "ag_graph_name_index" UNIQUE, btree (name)
    "ag_graph_namespace_index" UNIQUE, btree (namespace)
Referenced by:
    TABLE "ag_catalog.ag_label" CONSTRAINT "fk_graph_oid" FOREIGN KEY (graph) REFERENCES ag_catalog.ag_graph(graphid)
Access method: heap


graphdb_age_development=# \d+ ag_catalog.ag_label
                                              Table "ag_catalog.ag_label"
  Column  |         Type          | Collation | Nullable | Default | Storage | Compression | Stats target | Description 
----------+-----------------------+-----------+----------+---------+---------+-------------+--------------+-------------
 name     | name                  |           | not null |         | plain   |             |              | 
 graph    | oid                   |           | not null |         | plain   |             |              | 
 id       | ag_catalog.label_id   |           |          |         | plain   |             |              | 
 kind     | ag_catalog.label_kind |           |          |         | plain   |             |              | 
 relation | regclass              |           | not null |         | plain   |             |              | 
 seq_name | name                  |           | not null |         | plain   |             |              | 
Indexes:
    "ag_label_graph_oid_index" UNIQUE, btree (graph, id)
    "ag_label_name_graph_index" UNIQUE, btree (name, graph)
    "ag_label_relation_index" UNIQUE, btree (relation)
    "ag_label_seq_name_graph_index" UNIQUE, btree (seq_name, graph)
Foreign-key constraints:
    "fk_graph_oid" FOREIGN KEY (graph) REFERENCES ag_catalog.ag_graph(graphid)
Access method: heap



graphdb_age_development=# \d+ age_schema._ag_label_edge
                                                                                                                  Table "age_schema._ag_label_edge"
   Column   |        Type        | Collation | Nullable |                                                                        Default                                                                        | Storage  | Compression | Stats target | Description 
------------+--------------------+-----------+----------+-------------------------------------------------------------------------------------------------------------------------------------------------------+----------+-------------+--------------+-------------
 id         | ag_catalog.graphid |           | not null | ag_catalog._graphid(ag_catalog._label_id('age_schema'::name, '_ag_label_edge'::name)::integer, nextval('age_schema._ag_label_edge_id_seq'::regclass)) | plain    |             |              | 
 start_id   | ag_catalog.graphid |           | not null |                                                                                                                                                       | plain    |             |              | 
 end_id     | ag_catalog.graphid |           | not null |                                                                                                                                                       | plain    |             |              | 
 properties | ag_catalog.agtype  |           | not null | ag_catalog.agtype_build_map()                                                                                                                         | extended |             |              | 
Indexes:
    "_ag_label_edge_pkey" PRIMARY KEY, btree (id)
Access method: heap

now I can successfully do (I assume the warnings aren't a problem):
bin/rails c
and test lots of interesting stuff including my code.

mini example - create a simple Node with:

module AgeSchema
  module Nodes
    class Company
      include ApacheAge::Vertex

      attribute :company_name, :string
    end
  end
end

Usage:

quarry = AgeSchema::Nodes::Company.create(company_name: 'Bedrock Quarry')

All works well in development mode.

Expected behavior

When I run tests the migrations that have already run be detected in the test ENV and the tests should run. Without the Apache AGE extension - everything works as expected. I can run migrations, and tests and it detects the migrations ran.

Actual behavior

When I want to run tests migrations I detects they have already run (I can also run in development mode as often as I want and the migrations are detected and all works well), but when I run tests - even using RAILS_ENV=test, it no longer detects the migrations and I get migration errors:

❯ bin/rails db:drop                  
Dropped database 'graphdb_age_development'
Dropped database 'graphdb_age_test'

❯ bin/rails db:create                
Created database 'graphdb_age_development'
Created database 'graphdb_age_test'

❯ bin/rails db:migrate RAILS_ENV=test

== 20240505183043 AddAgeSetup: migrating ======================================
-- execute("CREATE EXTENSION IF NOT EXISTS age;")
   -> 0.0328s
-- execute("LOAD 'age';")
   -> 0.0004s
-- execute("SET search_path = ag_catalog, \"$user\", public;")
   -> 0.0003s
-- execute("SELECT create_graph('age_schema');")
   -> 0.0107s
== 20240505183043 AddAgeSetup: migrated (0.0443s) =============================

unknown OID 4089: failed to recognize type of 'namespace'. It will be treated as String.
unknown OID 2205: failed to recognize type of 'relation'. It will be treated as String.


❯ bin/rails db:migrate RAILS_ENV=test

unknown OID 47108: failed to recognize type of 'id'. It will be treated as String.
unknown OID 47140: failed to recognize type of 'properties'. It will be treated as String.
unknown OID 4089: failed to recognize type of 'namespace'. It will be treated as String.
unknown OID 2205: failed to recognize type of 'relation'. It will be treated as String.

❯ bin/rails db:migrate RAILS_ENV=test

unknown OID 47108: failed to recognize type of 'id'. It will be treated as String.
unknown OID 47140: failed to recognize type of 'properties'. It will be treated as String.
unknown OID 4089: failed to recognize type of 'namespace'. It will be treated as String.
unknown OID 2205: failed to recognize type of 'relation'. It will be treated as String.

❯ bin/rails test RAILS_ENV=test      

bin/rails aborted!
ActiveRecord::StatementInvalid: PG::DuplicateObject: ERROR:  constraint "fk_graph_oid" for relation "ag_label" already exists (ActiveRecord::StatementInvalid)
/Users/wti0405/devel/learn/mentor/age/graphdb_age/db/schema.rb:33:in `block in <main>'
/Users/wti0405/devel/learn/mentor/age/graphdb_age/db/schema.rb:13:in `<main>'

Caused by:
PG::DuplicateObject: ERROR:  constraint "fk_graph_oid" for relation "ag_label" already exists (PG::DuplicateObject)
/Users/wti0405/devel/learn/mentor/age/graphdb_age/db/schema.rb:33:in `block in <main>'
/Users/wti0405/devel/learn/mentor/age/graphdb_age/db/schema.rb:13:in `<main>'
Tasks: TOP => db:test:load_schema
(See full trace by running task with --trace)
Migrations are pending. To resolve this issue, run:

        bin/rails db:migrate

You have 1 pending migration:

db/migrate/20240505183043_add_age_setup.rb

While loading ./spec/age_schema/edges/works_at_spec.rb an `exit` / `raise SystemExit` occurred, RSpec will now quit.
Failure/Error: abort e.to_s.strip

SystemExit:
  Migrations are pending. To resolve this issue, run:

          bin/rails db:migrate

  You have 1 pending migration:

  db/migrate/20240505183043_add_age_setup.rb
# ./spec/rails_helper.rb:31:in `abort'
# ./spec/rails_helper.rb:31:in `rescue in <top (required)>'
# ./spec/rails_helper.rb:28:in `<top (required)>'
# ./spec/age_schema/edges/works_at_spec.rb:3:in `<top (required)>'
# ------------------
# --- Caused by: ---
# ActiveRecord::PendingMigrationError:
#   
#   
#   Migrations are pending. To resolve this issue, run:
#   
#           bin/rails db:migrate
#   
#   You have 1 pending migration:
#   
#   db/migrate/20240505183043_add_age_setup.rb
#   ./spec/rails_helper.rb:29:in `<top (required)>'

I am not sure why this last line of the schema exists and how it gets there, I think that is the reason for the crash, but maybe it really doesn't know that the migration already ran. This error happens even if I ensure the test db migration ran with:

bin/rails db:migrate RAILS_ENV=test

Can someone help me move forward / explain what I am overlooking or point me to where I can get help or appropriate documentation.

I read with each DB connection (like when I use psql I need first run)

LOAD 'age';
SET search_path = ag_catalog, \"$user\", public;

I don't know how to, but perhaps I need to do something like that first with the PG connection and then load the tests. Might that help?

System configuration

Rails version: Rails 7.1.3.2

Ruby version: 3.3.1

@btihen
Copy link
Author

btihen commented May 18, 2024

I have found a workaround - if I rewrite the schema.rb to:

ActiveRecord::Schema[7.1].define(version: 2024_05_05_183043) do
  execute('CREATE EXTENSION IF NOT EXISTS age;')
  execute <<-SQL
    DO $$
    BEGIN
      IF NOT EXISTS (
        SELECT 1
        FROM pg_namespace
        WHERE nspname = 'ag_catalog'
      ) THEN
        CREATE SCHEMA ag_catalog;
      END IF;
    END $$;
  SQL

  # These are extensions that must be enabled in order to support this database
  enable_extension 'age'
  enable_extension 'plpgsql'

  # Load the age code
  execute("LOAD 'age';")

  # Load the ag_catalog into the search path
  execute('SET search_path = ag_catalog, "$user", public;')

  execute <<-SQL
    DO $$
    BEGIN
      IF NOT EXISTS (
        SELECT 1
        FROM pg_constraint
        WHERE conname = 'fk_graph_oid'
      ) THEN
        ALTER TABLE ag_label ADD CONSTRAINT fk_graph_oid FOREIGN KEY (graph) REFERENCES ag_graph (graphid);
      END IF;
    END $$;
  SQL

  # create_schema 'age_schema'
  execute <<-SQL
    DO $$
    BEGIN
      IF NOT EXISTS (
        SELECT 1
        FROM ag_catalog.ag_graph
        WHERE name = 'age_schema'
      ) THEN
        PERFORM create_graph('age_schema');
      END IF;
    END $$;
  SQL
end

instead of - which indeed different than the migration (& incorrect):

ActiveRecord::Schema[7.1].define(version: 2024_05_05_183043) do
  create_schema "ag_catalog"
  create_schema "age_schema"

  # These are extensions that must be enabled in order to support this database
  enable_extension "age"
  enable_extension "plpgsql"

  add_foreign_key "ag_label", "ag_graph", column: "graph", primary_key: "graphid", name: "fk_graph_oid"
end

unfortunately, I need to reset the schema.rb after each new migration. Is there a solution to control the schema.rb generation and make it compatible with ApaageAGE?

@simi
Copy link
Contributor

simi commented May 18, 2024

Can you try with structure.sql instead of schema.rb? https://edgeguides.rubyonrails.org/active_record_migrations.html#schema-dumping-and-you

@mashallah
Copy link

mashallah commented May 20, 2024

@btihen thanks for working on this, I'd also be interested in exploring AGE for graph support.

You probably already saw this, but some ideas if name collision hasn't already been ruled out:
PG::DuplicateObject: ERROR: constraint "fk_graph_oid" for relation "ag_label" already exists (PG::DuplicateObject).

DBMS-level name collision?

You may have ruled this out already, but I wonder if create_graph('age_schema') is running twice, once successfully in dev migrations, then again in test, but failing since it has the same age_schema namespace, and that namespace is being shared across the dev and test databases.

Potential fixes

Separate graph namespaces for dev and test

You might be able to find examples among gems that have implemented support for PG extensions like activerecord-postgis-adapter and see how the situations where a common extension catalog and schema can handle this.

Option 1 - Isolate by DB, nothing shared at DBMS (preferred)

  • If AGE allows complete isolation where the graph and extension are unique to the dev or test DB, rather than sharing some elements like the graph name at the DBMS level, this might be easiest. In this approach, after migration there's an age_schema graph in your dev db and completely independent age_schema graph in your dev db. If AGE supports DB-level isolation of graphs like this instead of DBMS-level, this might be your best option.

Option 2 - Isolate by graph name

  • Modify execute("SELECT create_graph('age_schema');") to avoid dev/test namespace collision for the name age_schema, the end goal would be that when run in test mode the name of age_schema would be changing the age_schema_test. So you'd have age_schema and age_schema_test after running migrations.

Note this should be avoided if possible, I'd only consider it if AGE does not support DB-level isolation

Single graph namespace

  • Goal: modify execute("SELECT create_graph('age_schema');") by adding the equivalent of an IF NOT EXISTS statement.

Option 1 - Add check for existing AGE graph using AGE functions

  • Since create_graph is a function, you may want to make use of the AGE graph_exists() or get_graphnames() functions, integrating them into the equivalent of an IF NOT EXISTS check in the graph creation query above.

Option 2 - Add check for existing AGE graph using Postgres DDL queries

  • This would require running the create_graph function, determining the underlying structures you want to look for that are often created by create_graph then query to determine if they are present.

@btihen
Copy link
Author

btihen commented May 20, 2024

@simi

Can you try with structure.sql instead of schema.rb?

I will need to experiment a bit more. I am using a dockerized version of the DB with AGE (Postgres v16) but my psql version is postgresql v14). At the moment i can't change this setup n this computer - i'll see if i can get another setup going.

ActiveRecord::Schema[7.1].define(version: 20_240_505_183_043) do
  # Allow age extension
  execute('CREATE EXTENSION IF NOT EXISTS age;')

  # Load the age code
  execute("LOAD 'age';")

  # Load the ag_catalog into the search path
  execute('SET search_path = ag_catalog, "$user", public;')

  # Create age_schema graph if it doesn't exist
  execute("SELECT create_graph('age_schema');")
end

in the meantime I found I can just copy the migration into the schema.rb and just reset it with careful commits.

@btihen
Copy link
Author

btihen commented May 20, 2024

thanks for working on this, I'd also be interested in exploring AGE for graph support.

You probably already saw this, but some ideas if name collision hasn't already been ruled out:
PG::DuplicateObject: ERROR: constraint "fk_graph_oid" for relation "ag_label" already exists (PG::DuplicateObject).

@mashallah - Hi I'd be glad to collaborate.

I did see the name collision, but I've been ignoring them since I discoved the problem was that the schema is incorrectly generated - when i copy the migration into the schema - \i can run tests and there is no name collision - the schema just needs to be correct.

I have some basic working rails code at: https://github.com/btihen-dev/rails_graphdb_age_app (with tests running)

PS - code is still evolving, only handles ons schema, and doesn't work with all data types supported by AGE & I havn;t yet tested unique restrictions.

@rafaelfranca
Copy link
Member

Please use https://discuss.rubyonrails.org/ for questions/help, where a wider community will be able to help you. We reserve the issues tracker for issues only.

@rafaelfranca rafaelfranca closed this as not planned Won't fix, can't repro, duplicate, stale May 21, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

5 participants