
fix default_paramstyle #27

Merged (5 commits), Oct 16, 2019
Conversation

koxudaxi
Owner

This PR fixes default_paramstyle for the dialect.

Related Issue

#19

@codecov

codecov bot commented Oct 15, 2019

Codecov Report

Merging #27 into master will not change coverage.
The diff coverage is 100%.


@@          Coverage Diff          @@
##           master    #27   +/-   ##
=====================================
  Coverage     100%   100%           
=====================================
  Files           4      4           
  Lines         393    391    -2     
  Branches       48     48           
=====================================
- Hits          393    391    -2
Impacted Files Coverage Δ
pydataapi/dbapi.py 100% <100%> (ø) ⬆️
pydataapi/pydataapi.py 100% <100%> (ø) ⬆️
pydataapi/dialect.py 100% <100%> (ø) ⬆️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update b2ea173...f68ed18.

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

With 2267650 I still receive the pydantic error.

@koxudaxi
Owner Author

koxudaxi commented Oct 15, 2019

I just fixed it. It works:

    from sqlalchemy.orm import sessionmaker
    Session = sessionmaker()
    Session.configure(bind=engine)
    s = Session()
    s.bulk_save_objects([Pets(name='pet_name1'), Pets(name='pet_name2')])
    s.commit()

output

2019-10-16 04:20:19,575 INFO sqlalchemy.engine.base.Engine select version()
2019-10-16 04:20:19,575 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 04:20:19,588 INFO sqlalchemy.engine.base.Engine select current_schema()
2019-10-16 04:20:19,588 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 04:20:19,603 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS VARCHAR(60)) AS anon_1
2019-10-16 04:20:19,603 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 04:20:19,611 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS VARCHAR(60)) AS anon_1
2019-10-16 04:20:19,611 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 04:20:19,618 INFO sqlalchemy.engine.base.Engine show standard_conforming_strings
2019-10-16 04:20:19,618 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 04:20:19,642 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-10-16 04:20:19,654 INFO sqlalchemy.engine.base.Engine INSERT INTO pets (name) VALUES (:name)
2019-10-16 04:20:19,654 INFO sqlalchemy.engine.base.Engine ({'name': 'pet_name1'}, {'name': 'pet_name2'})
2019-10-16 04:20:19,665 INFO sqlalchemy.engine.base.Engine COMMIT

Process finished with exit code 0

@koxudaxi
Owner Author

@Rubyj
Would you please test it?

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

@koxudaxi I still receive an error but it is unrelated:

Traceback (most recent call last):
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/chalice/app.py", line 1082, in _get_view_function_response
    response = view_function(**function_args)
  File "/home/rjacobs/git/vcfparserlambda/src/parser/app.py", line 26, in index
    parse_vcf(n)
  File "/home/rjacobs/git/vcfparserlambda/src/parser/app.py", line 39, in parse_vcf
    BulkInsert().go(session, variants)
  File "/home/rjacobs/git/vcfparserlambda/src/parser/chalicelib/queries.py", line 3, in go
    session.bulk_save_objects(inserts)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2700, in bulk_save_objects
    False,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2888, in _bulk_save_mappings
    transaction.rollback(_capture_exception=True)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2882, in _bulk_save_mappings
    render_nulls,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 102, in _bulk_insert
    bookkeeping=return_defaults,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1084, in _emit_insert_statements
    c = cached_connections[connection].execute(statement, multiparams)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1468, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1224, in _execute_context
    cursor, statement, parameters, context
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 549, in do_executemany
    cursor.executemany(statement, parameters)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/pydataapi/dbapi.py", line 140, in executemany
    results = self._data_api.batch_execute(operation, seq_of_parameters)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/pydataapi/pydataapi.py", line 399, in batch_execute
    **options.build()
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.errorfactory.BadRequestException: An error occurred (BadRequestException) when calling the BatchExecuteStatement operation: Number of SQL parameters specified is more than 1000

Is there a way for me to increase this limit? Or will this bulk insert take too long.

@koxudaxi
Owner Author

koxudaxi commented Oct 15, 2019

I found it in https://docs.aws.amazon.com/rdsdataservice/latest/APIReference/API_BatchExecuteStatement.html

parameterSets
The parameter set for the batch operation.

The maximum number of parameters in a parameter set is 1,000.

I think the limitation cannot be changed. We could ask AWS Support about it...

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

I found it in https://docs.aws.amazon.com/rdsdataservice/latest/APIReference/API_BatchExecuteStatement.html

parameterSets
The parameter set for the batch operation.

The maximum number of parameters in a parameter set is 1,000.

Yeah, I just saw that as well. Do you have any ideas for workaround? Build multiple lists each being 1000 long maybe?

@koxudaxi
Owner Author

Build multiple lists each being 1000 long maybe?

@Rubyj
If you can limit the number of objects per bulk insert, I think that is the best way, because the Data API has a few limitations on operations. Could you cap the count?

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

Build multiple lists each being 1000 long maybe?

@Rubyj
If you can limit the number of objects per bulk insert, I think that is the best way, because the Data API has a few limitations on operations. Could you cap the count?

I have about 60k objects to bulk insert. In theory, I can make 60 lists each with 1000 objects and bulk insert them all.
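Splitting a large collection into batches of at most 1,000 can be done with a small stdlib-only helper. This is a sketch of the workaround discussed above; `chunked` is a hypothetical name, and the commented usage assumes a `session` and an `objects` list like the ones in this thread:

```python
from itertools import islice

def chunked(items, size=1000):
    """Yield successive lists of at most `size` items from an iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch

# Hypothetical usage: one bulk_save_objects() call per 1,000-object batch.
# for batch in chunked(objects, 1000):
#     session.bulk_save_objects(batch)
#     session.commit()
```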

@koxudaxi
Owner Author

There is another way: the library could divide the objects into groups of 1,000 and issue the operations itself. I'm still deciding whether to do that, because I think it is a tricky method.

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

There is another way: the library could divide the objects into groups of 1,000 and issue the operations itself. I'm still deciding whether to do that, because I think it is a tricky method.

I will try to divide my data by 1000. It should not be too hard.

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

@koxudaxi I was able to batch my data by 1000. However, now I run into the following error.

Traceback (most recent call last):
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/chalice/app.py", line 1082, in _get_view_function_response
    response = view_function(**function_args)
  File "/home/rjacobs/git/vcfparserlambda/src/parser/app.py", line 26, in index
    parse_vcf(n)
  File "/home/rjacobs/git/vcfparserlambda/src/parser/app.py", line 41, in parse_vcf
    BulkInsert().go(session, annotation_list)
  File "/home/rjacobs/git/vcfparserlambda/src/parser/chalicelib/queries.py", line 3, in go
    session.bulk_save_objects(inserts)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2700, in bulk_save_objects
    False,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2888, in _bulk_save_mappings
    transaction.rollback(_capture_exception=True)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/util/langhelpers.py", line 68, in __exit__
    compat.reraise(exc_type, exc_value, exc_tb)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/session.py", line 2882, in _bulk_save_mappings
    render_nulls,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 102, in _bulk_insert
    bookkeeping=return_defaults,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/orm/persistence.py", line 1084, in _emit_insert_statements
    c = cached_connections[connection].execute(statement, multiparams)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 988, in execute
    return meth(self, multiparams, params)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 287, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1107, in _execute_clauseelement
    distilled_params,
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1248, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1468, in _handle_dbapi_exception
    util.reraise(*exc_info)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 153, in reraise
    raise value
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1224, in _execute_context
    cursor, statement, parameters, context
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 549, in do_executemany
    cursor.executemany(statement, parameters)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/pydataapi/dbapi.py", line 140, in executemany
    results = self._data_api.batch_execute(operation, seq_of_parameters)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/pydataapi/pydataapi.py", line 399, in batch_execute
    **options.build()
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/home/rjacobs/git/vcfparserlambda/venv/lib/python3.7/site-packages/botocore/client.py", line 661, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (SerializationException) when calling the BatchExecuteStatement operation: 

This happens on the model with a relationship to the other model:

class VariantAnnotation(Base):
    variant = relationship('Variant', backref='annotations')
variant_object = Variant()
variant_annotation = VariantAnnotation(variant=variant_object)

@koxudaxi
Owner Author

@Rubyj
Would you please set echo=True?
I'm interested in the SerializationException.

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

@Rubyj
Would you please set echo=True?
I'm interested in the SerializationException.

I set echo=True for SQLAlchemy's create_engine, but because I am using chalice I do not see any logging. Hmm. How can I make this work with chalice?

@koxudaxi
Owner Author

@Rubyj
Sorry, I must go to bed. I will continue it tomorrow.

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

@Rubyj
Sorry, I must go to bed. I will continue it tomorrow.

Ok! I will continue to debug today and post what I find for you to see tomorrow. I can run outside of chalice using regular python. Thank you!!

Even outside of chalice with echo=True in create_engine I see no extra logging. 😞
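One thing worth trying (an assumption, not a confirmed fix): echo=True writes through the standard logging module, so if no handler is installed the records are silently dropped. Configuring logging directly may surface the SQL:

```python
import logging

# Install a handler on the root logger, then raise the level of
# SQLAlchemy's engine logger so SQL statements and parameters are emitted.
logging.basicConfig(level=logging.INFO)
logging.getLogger('sqlalchemy.engine').setLevel(logging.INFO)
```

In a Lambda/chalice environment a root handler may already be attached (in which case basicConfig is a no-op), so adjusting the level of the existing handler might be needed instead.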

@koxudaxi
Owner Author

Running pydataapi (MySQL) with SAM works. The query is:

    result: ResultProxy = engine.execute("select * from pets")

The output is shown in two screenshots (2019-10-16, omitted here).

@Rubyj
Contributor

Rubyj commented Oct 15, 2019

Running pydataapi (MySQL) with SAM works. The query is:

    result: ResultProxy = engine.execute("select * from pets")

The output is shown in two screenshots (2019-10-16, omitted here).

Weird. I wonder why I cannot get it to work in my SAM project with Postgres. I can try again. Did you have to do anything special?

@koxudaxi I took a look at the data that is being sent to the Data API from Python. It looks like the foreign key which I showed above is not included in the data that is being sent to _make_api_call().

Here is an example:

[
  {
    "name": "allele",
    "value": {
      "stringValue": "T"
    }
  },
  {
    "name": "annotation",
    "value": {
      "stringValue": "test"
    }
  },
  {
    "name": "annotation_impact",
    "value": {
      "stringValue": "test"
    }
  },
  {
    "name": "gene_name",
    "value": {
      "stringValue": "test"
    }
  },
  {
    "name": "gene_id",
    "value": {
      "stringValue": "test"
    }
  },
  {
    "name": "hgvs_c",
    "value": {
      "stringValue": "*test\u003eT"
    }
  },
  {
    "name": "hgvs_p",
    "value": {
      "doubleValue": "nan"
    }
  },
  {
    "name": "reference",
    "value": {
      "stringValue": "test"
    }
  }
]

I do not see the reference back to the Variant object included in this data. I make that reference as described here: https://stackoverflow.com/a/17330019/3439441

class Child(Base):
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship('Parent', backref='childs')

parent_object = Parent(name="test")
child_object = Child(parent=parent_object)
session.bulk_save_objects([parent_object])
session.bulk_save_objects([child_object])  # <- this is where the error occurs

Assigning the parent object like I did here in my code does not seem to make it to _make_api_call().

We could start with something like this:

u = User(user_name=u'dusual')
# no need to flush, no need to add `u` to the session because sqlalchemy becomes aware of the object once we assign it to c.user
c = Client(user=u, orgname="dummy_org")
session.add(c)
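Since SQLAlchemy's bulk operations are documented to skip relationship() cascades, another workaround is to flush the parent first and pass the foreign key column directly. A sketch against an in-memory SQLite engine (standing in here for the Data API engine):

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()

class Parent(Base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    name = Column(String(50))

class Child(Base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship(Parent, backref='children')

engine = create_engine('sqlite://')  # in-memory stand-in for the Data API engine
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Persist the parent with a regular add() + flush() so its primary key exists...
parent = Parent(name='test')
session.add(parent)
session.flush()

# ...then give the bulk path the foreign key column directly, because
# bulk_save_objects() does not follow relationship() assignments.
session.bulk_save_objects([Child(parent_id=parent.id)])
session.commit()
```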

@koxudaxi
Owner Author

koxudaxi commented Oct 16, 2019

I have tested simple related tables, and it works fine. However, I ran it against Aurora Serverless MySQL.
I will create a PostgreSQL cluster and test again tonight.

class Parent(base):
    __tablename__ = 'parent'
    id = Column(Integer, primary_key=True)
    name = Column(String(255, collation='utf8_unicode_ci'), default=None)


class Child(base):
    __tablename__ = 'child'
    id = Column(Integer, primary_key=True)
    name = Column(String(255, collation='utf8_unicode_ci'), default=None)
    parent_id = Column(Integer, ForeignKey('parent.id'))
    parent = relationship('Parent', backref='childs')


s = Session()
parent_object = Parent(id=1, name="test")
child_object = Child(id=1, parent=parent_object)
s.bulk_save_objects([parent_object])
s.bulk_save_objects([child_object])
...

@koxudaxi I took a look at the data that is being sent to the Data API from Python. It looks like the foreign key which I showed above is not included in the data that is being sent to

I think SQLAlchemy splits the SQL into parent and child statements.
Here are the raw operations:

2019-10-16 14:01:58,414 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'sql_mode'
2019-10-16 14:01:58,414 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 14:01:58,536 INFO sqlalchemy.engine.base.Engine SHOW VARIABLES LIKE 'lower_case_table_names'
2019-10-16 14:01:58,536 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 14:01:58,665 INFO sqlalchemy.engine.base.Engine SELECT DATABASE()
2019-10-16 14:01:58,666 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 14:01:59,105 INFO sqlalchemy.engine.base.Engine SELECT CAST('test plain returns' AS CHAR(60)) AS anon_1
2019-10-16 14:01:59,105 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 14:01:59,165 INFO sqlalchemy.engine.base.Engine SELECT CAST('test unicode returns' AS CHAR(60)) AS anon_1
2019-10-16 14:01:59,165 INFO sqlalchemy.engine.base.Engine {}
2019-10-16 14:01:59,294 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-10-16 14:01:59,354 INFO sqlalchemy.engine.base.Engine INSERT INTO parent (id, name) VALUES (:id, :name)
2019-10-16 14:01:59,354 INFO sqlalchemy.engine.base.Engine {'id': 1, 'name': 'test'}
2019-10-16 14:01:59,457 INFO sqlalchemy.engine.base.Engine INSERT INTO child (id) VALUES (:id)
2019-10-16 14:01:59,457 INFO sqlalchemy.engine.base.Engine {'id': 1}

Also, I'm interested in the value.

  {
    "name": "hgvs_p",
    "value": {
      "doubleValue": "nan"
    }
  },

nan may be an invalid value.

@koxudaxi
Owner Author

@Rubyj
I created a Postgres Aurora cluster and deployed an app with SAM. It works fine.
Also, I tested bulk insert on a simple related table. It's OK.

If you can show me the DDL for creating the tables and your SQLAlchemy table classes, I can try it.

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

@koxudaxi

Also, I'm interested in the value.

  {
    "name": "hgvs_p",
    "value": {
      "doubleValue": "nan"
    }
  },

nan may be an invalid value.

This is strange. I am not sure why it is described as doubleValue. That column is defined as:

hgvs_p = Column(String(100))

@Rubyj
I created a Postgres Aurora cluster and deployed an app with SAM. It works fine.
Also, I tested bulk insert on a simple related table. It's OK.

If you can show me the DDL for creating the tables and your SQLAlchemy table classes, I can try it.

Yes. Here they are:

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import relationship
from sqlalchemy import Column, Integer, String, Float, ForeignKey

Base = declarative_base()


class Variant(Base):
    __tablename__ = 'variant'

    id = Column(Integer, primary_key=True)
    chrom = Column(String(5))
    pos = Column(String(20))
    rs_id = Column(String(50))
    ref = Column(String(50))
    alt = Column(String(50))
    qual = Column(Float)


class VariantAnnotation(Base):
    __tablename__ = 'variant_annotation'

    id = Column(Integer, primary_key=True)
    variant_id = Column(Integer, ForeignKey('variant.id'))
    variant = relationship(Variant, backref='annotations')
    allele = Column(String(50))
    annotation = Column(String(50))
    annotation_impact = Column(String(50))
    gene_name = Column(String(50))
    gene_id = Column(String(50))
    hgvs_c = Column(String(100))
    hgvs_p = Column(String(100))
    reference = Column(String(500))

What is the best way to get DDL from serverless? I took a look at the table information and it matched these models:

(table_catalog = postgres, table_schema = public, table_name = variant_annotation for all rows)

column_name       | pos | column_default                                 | is_nullable | data_type         | max_length
id                | 1   | nextval('variant_annotation_id_seq'::regclass) | NO          | integer           | NULL
variant_id        | 2   | NULL                                           | YES         | integer           | NULL
allele            | 3   | NULL                                           | YES         | character varying | 50
annotation        | 4   | NULL                                           | YES         | character varying | 50
annotation_impact | 5   | NULL                                           | YES         | character varying | 50
gene_name         | 6   | NULL                                           | YES         | character varying | 50
gene_id           | 7   | NULL                                           | YES         | character varying | 50
hgvs_c            | 8   | NULL                                           | YES         | character varying | 100
hgvs_p            | 9   | NULL                                           | YES         | character varying | 100
reference         | 10  | NULL                                           | YES         | character varying | 500

@koxudaxi
Owner Author

koxudaxi commented Oct 16, 2019

@Rubyj
Thank you for showing me the classes.

What is the best way to get DDL from serverless?

Sorry, I don't know.
Did you create the tables from the SQLAlchemy classes, or from DDL?
I want to create the tables the same way.

This is strange. I am not sure why it is described as doubleValue. That column is defined as:
hgvs_p = Column(String(100))

It may be a bug 😖

PS. I just created tables from SQLAlchemy classes.

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

Did you create the tables from the SQLAlchemy classes, or from DDL?
I want to create the tables the same way.

Yes, I created the tables using alembic.

After creating models.

pip install alembic
alembic init alembic  # in the project directory

alembic.ini

sqlalchemy.url = postgresql+pydataapi://

alembic/env.py - edit run_migrations_online() to use create_engine

def run_migrations_online():
    """Run migrations in 'online' mode.

    In this scenario we need to create an Engine
    and associate a connection with the context.

    """
    aws_args = {'resource_arn': 'arn...',
                'secret_arn': 'arn...',
                'database': 'postgres'}
    url = config.get_main_option("sqlalchemy.url")
    connectable = create_engine(
        url,
        connect_args=aws_args,
        poolclass=pool.NullPool
    )
...

alembic revision --autogenerate -m "Added tables"
alembic upgrade head

@koxudaxi
Owner Author

koxudaxi commented Oct 16, 2019

alembic dumps an error:

  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/re.py", line 183, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object

I created all the tables in a simple way:

Base.metadata.create_all(bind=engine)

It works.

I will try bulk insert.

I ran these lines:

variant_object = Variant(alt="test")
variant_annotation = VariantAnnotation(variant=variant_object)
s.bulk_save_objects([variant_object])
s.bulk_save_objects([variant_annotation])
s.commit()
result: ResultProxy = engine.execute(Select([Variant]))
print(result.fetchall())

output

[(2, True, True, True, True, 'test', True)]

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

alembic dumps an error:

  File "/usr/local/Cellar/python/3.7.3/Frameworks/Python.framework/Versions/3.7/lib/python3.7/re.py", line 183, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object

I created all the tables in a simple way:

Base.metadata.create_all(bind=engine)

It works.

I will try bulk insert.

Not sure about that error. What does your alembic setup look like? OK, I will also look at my models and make the lengths longer. The error may be there.

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

output

[(2, True, True, True, True, 'test', True)]

Why are there so many boolean values? Hmm, maybe this is a problem with my data then. I will need to investigate further why this thinks my string is a double. I just made the inserts work using text INSERT INTO statements from the code.

@koxudaxi
Owner Author

It's a bug!!

The real values are:

<class 'list'>: [{'longValue': 2}, {'isNull': True}, {'isNull': True}, {'isNull': True}, {'isNull': True}, {'stringValue': 'test'}, {'isNull': True}]

Each isNull entry should come back as None.
I'm fixing that part 😉
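The inverse conversion can be sketched as follows (`convert_field` is a hypothetical name, not pydataapi's actual internal function): a field dict with isNull set must map to None rather than to the boolean True.

```python
def convert_field(field):
    """Convert one Data API field dict back to a Python value.

    A field like {'isNull': True} must become None, not True.
    """
    if field.get('isNull'):
        return None
    # The remaining keys ('longValue', 'stringValue', ...) carry the value.
    return next(iter(field.values()))

row = [{'longValue': 2}, {'isNull': True}, {'stringValue': 'test'}]
print([convert_field(f) for f in row])  # -> [2, None, 'test']
```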

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

It's a bug!!

The real values are:

<class 'list'>: [{'longValue': 2}, {'isNull': True}, {'isNull': True}, {'isNull': True}, {'isNull': True}, {'stringValue': 'test'}, {'isNull': True}]

Each isNull entry should come back as None.
I'm fixing that part

Aha! Although I think this is a different bug 😅 I wonder if it is related to my nan being a double.

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

@koxudaxi I tried increasing the length on my models but I still get the SerializationException. I wonder how I can debug this better or get a better error message.

@koxudaxi
Owner Author

@Rubyj
I set a breakpoint to debug with PyCharm on a local machine. Could you do that?
Could you do it?

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

@Rubyj
I set a breakpoint to debug with PyCharm on a local machine. Could you do that?

Yes, I can. I have been trying to set a breakpoint, but I do not know the correct place to set it to get a good error message.

@koxudaxi
Owner Author

Here!! This is where to check the request in batch_execute; sql there is the raw SQL.
(screenshot of the PyCharm debugger, 2019-10-17, omitted here)

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

@koxudaxi I got the logging to work!!!

I think nan is indeed the issue. I see 'hgvs_p': nan in the insert statement. It is supposed to be a string. I think I will need to clean my data. For some reason, pandas is treating missing values as {float} nan.

@koxudaxi
Owner Author

Should pydataapi clean up the data?

Then we should fix the converting method:

def convert_value(value: Any) -> Dict[str, Any]:
    if isinstance(value, bool):
        return {'booleanValue': value}
    elif isinstance(value, str):
        return {'stringValue': value}
    elif isinstance(value, int):
        return {'longValue': value}
    elif isinstance(value, float):
        return {'doubleValue': value}
    elif isinstance(value, bytes):
        return {'blobValue': value}
    elif value is None:
        return {'isNull': True}
    else:
        raise Exception(f'unsupported type {type(value)}: {value} ')
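One possible adjustment (a sketch of the idea, not the committed fix): treat non-finite floats such as numpy.nan as NULL inside the float branch, since the Data API rejects nan serialized as a doubleValue. Raising an explicit error for nan would be another defensible choice.

```python
import math
from typing import Any, Dict

def convert_value(value: Any) -> Dict[str, Any]:
    if isinstance(value, bool):  # bool must be checked before int
        return {'booleanValue': value}
    elif isinstance(value, str):
        return {'stringValue': value}
    elif isinstance(value, int):
        return {'longValue': value}
    elif isinstance(value, float):
        # float('nan') (e.g. numpy.nan) serializes as the string "nan",
        # which the Data API rejects; treat it as NULL instead.
        if math.isnan(value):
            return {'isNull': True}
        return {'doubleValue': value}
    elif isinstance(value, bytes):
        return {'blobValue': value}
    elif value is None:
        return {'isNull': True}
    else:
        raise Exception(f'unsupported type {type(value)}: {value}')
```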

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

Should pydataapi clean up the data?

Then we should fix the converting method:

def convert_value(value: Any) -> Dict[str, Any]:
    if isinstance(value, bool):
        return {'booleanValue': value}
    elif isinstance(value, str):
        return {'stringValue': value}
    elif isinstance(value, int):
        return {'longValue': value}
    elif isinstance(value, float):
        return {'doubleValue': value}
    elif isinstance(value, bytes):
        return {'blobValue': value}
    elif value is None:
        return {'isNull': True}
    else:
        raise Exception(f'unsupported type {type(value)}: {value} ')

It is working now!!! But it is taking a little while because I have around 100,000 models to insert. The problem was that I had numpy.nan in my data; numpy.nan is considered a float for some reason. I replaced all numpy.nan with None and the SerializationException went away.

Sorry for the trouble!
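Since numpy.nan is just a Python float('nan'), the cleanup can be done without a numpy dependency. A sketch of a row scrubber (`scrub_nan` is a hypothetical helper, not part of pydataapi):

```python
import math

def scrub_nan(mapping):
    """Return a copy of a row dict with float NaN values replaced by None."""
    return {
        key: None if isinstance(value, float) and math.isnan(value) else value
        for key, value in mapping.items()
    }

row = {'hgvs_p': float('nan'), 'allele': 'T', 'qual': 1.5}
print(scrub_nan(row))  # -> {'hgvs_p': None, 'allele': 'T', 'qual': 1.5}
```

For pandas users, converting the DataFrame's missing values to None before building the ORM objects achieves the same thing.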

@koxudaxi
Owner Author

@Rubyj
Should we handle numpy.nan and other special cases?
I don't know who should handle the special values: users, or the SQLAlchemy driver?

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

I think it is OK for the user to handle this special value. Otherwise, numpy becomes a dependency of pydataapi.

My inserts have been running for over 5 minutes 🤦‍♂️

@koxudaxi koxudaxi merged commit 210c1e9 into master Oct 16, 2019
@koxudaxi
Owner Author

@Rubyj
Thank you for your advice.
OK, this comment thread is getting long, so I just merged.

We could fix a lot of bugs in this PR 🎉
I appreciate it very much 😄

@koxudaxi koxudaxi deleted the fix_default_paramstyle branch October 16, 2019 15:58
@Rubyj
Contributor

Rubyj commented Oct 16, 2019

@koxudaxi Of course!! Thank YOU!

One last thing. Do you know if bulk_insert_mappings() will work with pydataapi and foreign keys instead of bulk_save_objects()?

@koxudaxi
Owner Author

koxudaxi commented Oct 16, 2019

Sorry, I don't know. However, bulk_insert_mappings() works for a single table:

2019-10-17 01:03:05,610 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-10-17 01:03:05,634 INFO sqlalchemy.engine.base.Engine INSERT INTO parent (id, name) VALUES (:id, :name)
2019-10-17 01:03:05,634 INFO sqlalchemy.engine.base.Engine {'id': 3, 'name': 'test'}
2019-10-17 01:03:05,667 INFO sqlalchemy.engine.base.Engine COMMIT

@Rubyj
Contributor

Rubyj commented Oct 16, 2019

Sorry, I don't know. However, bulk_insert_mappings() works for a single table:

2019-10-17 01:03:05,610 INFO sqlalchemy.engine.base.Engine BEGIN (implicit)
2019-10-17 01:03:05,634 INFO sqlalchemy.engine.base.Engine INSERT INTO parent (id, name) VALUES (:id, :name)
2019-10-17 01:03:05,634 INFO sqlalchemy.engine.base.Engine {'id': 3, 'name': 'test'}
2019-10-17 01:03:05,667 INFO sqlalchemy.engine.base.Engine COMMIT

OK, I am interested to see if it works for relationships. I can look.
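For reference, bulk_insert_mappings() takes plain column dicts, so a relationship has to be expressed as the foreign key value itself. A sketch against an in-memory SQLite engine (standing in here for the Data API), using simplified versions of the models above:

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class Variant(Base):
    __tablename__ = 'variant'
    id = Column(Integer, primary_key=True)
    alt = Column(String(50))

class VariantAnnotation(Base):
    __tablename__ = 'variant_annotation'
    id = Column(Integer, primary_key=True)
    variant_id = Column(Integer, ForeignKey('variant.id'))
    allele = Column(String(50))

engine = create_engine('sqlite://')  # in-memory stand-in for the Data API engine
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

session.bulk_insert_mappings(Variant, [{'id': 1, 'alt': 'test'}])
# Plain dicts: the relationship is carried by the foreign key value itself.
session.bulk_insert_mappings(
    VariantAnnotation,
    [{'variant_id': 1, 'allele': 'T'}, {'variant_id': 1, 'allele': 'A'}],
)
session.commit()
```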

@koxudaxi
Owner Author

I have released version 0.4.4 🎉
