[INSTRM-890] Decide on mechanism for saving unsigned integers to SQL. Created: 28/Jan/20 Updated: 31/Oct/20 |
|
Status: | Open |
Project: | Instrument control development |
Component/s: | None |
Affects Version/s: | None |
Fix Version/s: | None |
Type: | Task | Priority: | Normal |
Reporter: | cloomis | Assignee: | Unassigned |
Resolution: | Unresolved | Votes: | 0 |
Labels: | opDB | ||
Remaining Estimate: | Not Specified | ||
Time Spent: | Not Specified | ||
Original Estimate: | Not Specified |
Issue Links: |
|
Description |
Postgres does not provide unsigned integer types (nor does the SQL standard). At least one of our column types (for pfsDesignIds, etc.) requires 64-bit unsigned ints, which have no larger type to upgrade to. Besides that, sqlAlchemy has no existing mechanism for synthesizing a new type. There is a pguint extension which people seem to use. Or we could convert signed<->unsigned above the SQL/sqlAlchemy layer. Others? MySQL has unsigned types, FWIW.
This is not a problem for FITS: for tables, TSCAL and TZERO do the conversion, and for header cards the content of the integer field is (surprisingly) not constrained.
Kiyoto Yabe? rhl? price? |
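For concreteness, the range problem and the bit-level reinterpretation can be sketched in a few lines of Python (an illustration only, not project code; the example id value is hypothetical):

```python
import struct

# Postgres bigint covers [-2**63, 2**63 - 1]; a 64-bit unsigned id
# with its top bit set overflows that range.
design_id = 2**63
assert design_id > 2**63 - 1

# Reinterpreting the identical 64-bit pattern as signed fits the range...
signed = struct.unpack('<q', struct.pack('<Q', design_id))[0]
assert signed == -2**63

# ...and the reverse reinterpretation recovers the original value.
assert struct.unpack('<Q', struct.pack('<q', signed))[0] == design_id
```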
Comments |
Comment by Kiyoto Yabe [ 28/Jan/20 ] |
I was proposing to use `NUMERIC` instead but this may not be a good idea because of performance etc. |
Comment by cloomis [ 28/Jan/20 ] |
I can't believe performance would really matter: we do not operate on the values beyond = and maybe !=. But sqlalchemy wants the python side of a Numeric to be a float or a python Decimal, neither of which is directly useable. If we need to subclass or otherwise add sqlalchemy conversion/cast logic, I think we would be better off doing that for their Integer or BigInt types, where we could easily convert between unsigned and signed values. That said, I do not actually know the right way to do this. Maybe something in https://docs.sqlalchemy.org/en/13/core/custom_types.html#creating-new-types, with signed = unsigned ^ (1<<63) as the cast?
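For what it's worth, that XOR cast is equivalent to a plain offset by 2**63 once the flipped bit pattern is read back in two's complement. A quick sanity check (illustration only, assuming Python; not project code):

```python
import struct

def xor_cast(u):
    # Flip the top bit, then reinterpret the 64-bit pattern as signed.
    return struct.unpack('<q', struct.pack('<Q', u ^ (1 << 63)))[0]

# Over the whole unsigned range this equals the order-preserving
# offset u - 2**63, including at the boundaries.
for u in (0, 1, 2**63 - 1, 2**63, 2**64 - 1):
    assert xor_cast(u) == u - 2**63
```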
Comment by Kiyoto Yabe [ 30/Oct/20 ] |
I tested the Postgres extension pguint on my local Linux machine with PostgreSQL 10.14. Although the installation was OK, I got an unexplained segfault whenever I put any value into a table with a `uint8` datatype. The repository seems to be maintained by only one person, and I'm worried about using such an unstable extension in opDB. I also tested a signed<->unsigned conversion on the sqlalchemy side, keeping the column datatype as `bigint`, implementing `model.py` based on the link Craig suggested, like this:
class UnsignedBigInteger(TypeDecorator):
    impl = BigInteger

    def __init__(self, *args, **kwargs):
        TypeDecorator.__init__(self, *args, **kwargs)

    def process_bind_param(self, value, dialect):
        if value is not None:
            return value - (1 << 63)

    def process_result_value(self, value, dialect):
        if value is not None:
            return value + (1 << 63)


class pfs_visit(Base):
    '''Tracks the Gen2 visit identifier.
    This is the fundamental identifier for all instrument exposures (MCS, AGC, SPS)
    '''
    __tablename__ = 'pfs_visit'

    pfs_visit_id = Column(Integer, primary_key=True, unique=True, autoincrement=False)
    pfs_visit_description = Column(String)
    pfs_design_id = Column(UnsignedBigInteger)

    def __init__(self, pfs_visit_id, pfs_visit_description, pfs_design_id):
        self.pfs_visit_id = pfs_visit_id
        self.pfs_visit_description = pfs_visit_description
        self.pfs_design_id = pfs_design_id

I think this works well. I can insert pfsDesignId=1 like
utils.insert(url, 'pfs_visit', pd.DataFrame({'pfs_visit_id': [120981, 120982, 120983], 'pfs_visit_description': ['test', 'test', 'test'], 'pfs_design_id': [pfsDesignId, pfsDesignId, pfsDesignId]}))
and see
develop=# select * from pfs_visit;
 pfs_visit_id | pfs_visit_description |     pfs_design_id
--------------+-----------------------+----------------------
       120981 | test                  | -9223372036854775807
       120982 | test                  | -9223372036854775807
       120983 | test                  | -9223372036854775807
(3 rows)
I fetch the table like
res = utils.fetch_all(url, 'pfs_visit')
print(res)
and see

   pfs_visit_id pfs_visit_description  pfs_design_id
0        120981                  test              1
1        120982                  test              1
2        120983                  test              1

So I think that, as long as we use the opDB package, we can treat unsigned bigints without caring about the conversion. We only need to worry about it if we want to access opDB directly. What do you think, cloomis?
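To illustrate the direct-access caveat: a client that bypasses the opDB package and reads the bigint column with plain SQL sees the shifted value and must apply the inverse offset itself (a sketch using the stored value from the select above, not project code):

```python
# Raw bigint as stored in pfs_design_id.
raw = -9223372036854775807

# Reverse of process_result_value: undo the 2**63 offset by hand.
pfs_design_id = raw + (1 << 63)
assert pfs_design_id == 1
```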
|
Comment by cloomis [ 31/Oct/20 ] |
I was hoping that approach (using signed integers inside the db, converting simply on select/insert) would work. The bits are the same and should not change if we do it right. So yes, I like that.