SQLAlchemy Declarative Class Reflector

3.00 avg. rating (63% score) - 5 votes

SQLAlchemy has a nice reflection facility which takes for example a database table name as argument and produces a Table object out of it for manipulation. However, those objects do not behave like the objects produced by declarative classes, which are easier to work with. Here’s a little class that helps to bridge that gap by reflecting proper declarative classes from database tables. It has only been tested with PostgreSQL, but it may work with other databases as well.

The class SQLReflector requires the preconfigured SQLAlchemy database engine as argument. Example (try in IPython):

# Make a connection string
import getpass
password = getpass.getpass()
connstr = "postgresql+psycopg2://username:%s@localhost/dbname" % password

# Create an engine
from sqlalchemy import create_engine
engine = create_engine(connstr)

# Create the SQLReflector object using the engine
from sqlreflector import SQLReflector
reflector = SQLReflector(engine)

# Start a session (you can use sqlreflector.session as well)
from sqlalchemy.orm import sessionmaker
sessionmaker = sessionmaker(bind=engine)
session = sessionmaker()

# Make a connection string

import getpass

password = getpass.getpass()

connstr = "postgresql+psycopg2://username:%s@localhost/dbname" % password

# Create an engine

from sqlalchemy import create_engine

engine = create_engine(connstr)

# Create the SQLReflector object using the engine

from sqlreflector import SQLReflector

reflector = SQLReflector(engine)

# Start a session (you can use sqlreflector.session as well)

from sqlalchemy.orm import sessionmaker

sessionmaker = sessionmaker(bind=engine)

session = sessionmaker()

The instantiated SQLReflector object can be used to reflect database objects using the following methods:

reflect_table(schema_name, table_name): reflects a single table into a class
reflect_tables(schema_name, table_name_list): reflects multiple tables at once from one schema
reflect_schema(schema_name): reflects all tables and sequences in the given schema
reflect_database(): reflects everything from a database

So to reflect the table “my_table” in schema “public”, write:

MyTable = reflector.reflect_table("public", "my_table")

1	MyTable = reflector.reflect_table("public", "my_table")

Now you can query it like any declarative class:

myrow = session.query(MyTable).filter(MyTable.id == 1).one()

1	myrow = session.query(MyTable).filter(MyTable.id == 1).one()

The reflect_table method accepts a column_definitions argument. You can use that to define missing stuff, such as primary keys for views, or foreign key relationships not defined in the database.

For example PostgreSQL views don’t have primary keys, which causes problems with SQLAlchemy. To define a primary key while reflecting the class, you can write something like this:

from sqlalchemy import Column, Integer
MyView  = reflector.reflect_table(
              'my_schema',
              'my_view',
              column_definitions=
              (
                  Column('id_col', Integer, primary_key=True),
              )
          )

from sqlalchemy import Column, Integer

MyView = reflector.reflect_table(

'my_schema',

'my_view',

column_definitions=

(

Column('id_col', Integer, primary_key=True),

)

To define an explicit foreign key, you can write something like:

from sqlalchemy import Column, Integer, ForeignKey
MyTable = reflector.reflect_class(
              'my_schema',
              'my_table',
              column_definitions=
              (
                  Column(
                      'fk_col',
                      Integer,
                      ForeignKey('foreign_schema.foreign_table.foreign_column')
                  )
              )
          )

from sqlalchemy import Column, Integer, ForeignKey

MyTable = reflector.reflect_class(

'my_schema',

'my_table',

column_definitions=

(

Column(

'fk_col',

Integer,

ForeignKey('foreign_schema.foreign_table.foreign_column')

)

You can find all the tables you have reflected in the SQLReflector instance, under the classes property. Every schema you have used is there, and under each schema every table you have reflected. So if you run reflector.reflect_database(), you should find all you schemas under reflector.classes, and every table like this:

reflector.classes.<schema_name>.<class_name>

But remember, if you have any views defined in a PostgreSQL database, the reflect_database() call will fail, because you must explicitly add a primary key constraint to every view to please SQLAlchemy.

There are a couple of helper methods to snoop around the database and see what’s there. The method get_schema_names() will give a list of schema names in the database. The method get_table_names(schema_name) will give a list of table names in a given schema. There is a full sqlalchemy.engine.reflection.Inspector object in the instantiated SQLReflector object (the class member name is, not surprisingly, “inspector”). You can use that to further inspect the database.

SQL objects can have names containing for example accented letters and special characters. Python 2 allows only letters, numbers and underscores in identifier names. Python 3 allows some unicode characters in identifiers. For compatibility’s sake, SQLReflector mangles SQL identifiers so that the resulting Python identifiers contain only letters, numbers and underscores.

By default, table names are converted into CamelCase in such a way, that the name is converted into lowercase, and after that, underscore characters in table names are removed, and the character following the underscore character is changed into uppercase. Also first character is uppercased.

For example:

Table called “my_data” becomes a class called “MyData”
Table called “MyData” becomes a class called “Mydata”

If you don’t like this behaviour, you can disable it by giving a “camelcase=False” argument to the constructor. In this case, the schema name sanitizer is used for table names.

Schema names are not converted into CamelCase, but are sanitized from unwanted characters.

You can disable all sanitization by giving the class constructor a “sanitize_names=False” argument. In this case the SQL name is used as the class name whatever it is, possibly resulting in exceptions if the class name contains characters Python does not accept.

Here’s the code (sqlreflector.py):

# Author Mikko Kortelainen <mikko.kortelainen@techelp.fi>
# Tested (not very thoroughly) with Python 2.7.5 and 3.3.2, and SQLAlchemy 0.8.2

import sqlalchemy
from sqlalchemy import create_engine, MetaData, Table, Sequence
from sqlalchemy.orm import sessionmaker
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.engine import reflection

class SQLReflector(object):
    """Reflects SQLAlchemy declarative classes from database.

    Requires the SQLAlchemy database engine as argument. Tested only with
    PostgreSQL. For example:

    from sqlalchemy import create_engine
    connstr = "postgresql+psycopg2://user:pass@host/database?sslmode=require"
    engine = create_engine(connstr)
    reflector = SQLReflector(engine)
    session = reflector.session

    MyTable = reflector.reflect("my_schema", "my_table")
    a_row = session.query(MyTable).filter(MyTable.id == 1).one()

    After a table has been reflected, it is saved in the SQLReflector object.
    You can find them under the classes property:

    reflector.classes.<schemaname>.<tablename>

    By default schema and table names are sanitized so that characters which
    are not letters, numbers or underscores will be stripped from the name.
    Also, initial numeric characters will be stripped. What's left is treated
    as the identifier. For table names (not schema names), the default is
    also to convert the resulting name into CamelCase by first converting
    it to lowercase and after that removing all underscore characters and
    converting the character following the underscore to uppercase. For
    example:

      Table "my_data" becomes class "MyData"
      Table "MyData" becomes class "Mydata"

    You can turn CamelCase conversion off by giving camelcase=False.

    If you give sanitize_names=False, the names are not sanitized at all
    before making Python identifiers out of them. So your schemas and tables
    should have names that are compliant already.
    """

    def __init__(self, database_engine, sanitize_names=True, camelcase=True):
        self.engine = database_engine
        self.sessionmaker = sessionmaker(bind=self.engine)
        self.session = self.sessionmaker()
        self.metadata = MetaData(bind=self.engine)
        self.inspector = reflection.Inspector.from_engine(self.engine)

        self.classes = self.make_empty_object()
        self.sanitize_names = sanitize_names
        self.camelcase = camelcase

    def __repr__(self):
        """Formal representation"""
        return """<SQLReflector sanitize_names=%s camelcase=%s>""" % (
            "True" if self.sanitize_names else "False",
            "True" if self.camelcase else "False")

    def get_schema_names(self):
        """Returns a list of available schemas in database"""
        return self.inspector.get_schema_names()

    def get_table_names(self, schema_name):
        """Returns a list of available tables in a schema.

        For more in-depth exploration you can use the SQLAlchemy reflection.Inspector
        instance inside the class. The member is called "inspector". """
        return self.inspector.get_table_names(schema=schema_name)

    def reflect_table(self, schema_name, table_name, column_definitions=None, \
                            sanitize_names=None, camelcase=None):
        """Reflects a table's metadata and creates an SQLAlchemy declarative
        class with a nice sanitized CamelCase name out of it.

        You may give extra column definitions or pretty much anything you want
        to add to the class in the column_definitions argument. For example to
        define an integer primary key to a PostgreSQL view:

        from sqlalchemy import Column, Integer
        MyView  = reflector.reflect_table(
                      'my_schema',
                      'my_view',
                      column_definitions=
                      (
                          Column('id_col', Integer, primary_key=True),
                      )
                  )

        Or you can define an explicit foreign key if such is not in place in the
        database:

        from sqlalchemy import Column, Integer, ForeignKey
        MyTable = reflector.reflect_class(
                      'my_schema',
                      'my_table',
                      column_definitions=
                      (
                          Column(
                              'fk_col',
                              Integer,
                              ForeignKey('foreign_schema.foreign_table.foreign_column')
                          )
                      )
                  )

        """

        # Sanitize names if requested
        if camelcase is None: camelcase = self.camelcase
        if sanitize_names is None: sanitize_names = self.sanitize_names

        if sanitize_names:
            class_name = self.sanitize_tablename(table_name, camelcase)
            sane_schema_name = self.sanitize_schemaname(schema_name)
        else:
            class_name = table_name
            sane_schema_name = schema_name

        # Make a class from declarative_base()
        Base = declarative_base()
        class_bases = (Base,)

        if column_definitions is not None:
            # Add extra column_definitions argument as part of the new class
            table_definition = Table(
                table_name,
                self.metadata,
                *column_definitions,
                schema=schema_name,
                autoload=True)

        else:
            # No extra definitions
            table_definition = Table(
                table_name,
                self.metadata,
                schema=schema_name,
                autoload=True)

        # Make a proper class out of it
        the_class = type(class_name, class_bases, dict(__table__ = table_definition))

        # Save it as self.classes.<sane_schema_name>.<class_name>
        if not hasattr(self.classes, sane_schema_name):
            setattr(self.classes, sane_schema_name, self.make_empty_object())

        setattr(getattr(self.classes, sane_schema_name), class_name, the_class)

        # Also return it
        return the_class

    def reflect_tables(self, schema_name, table_name_list, \
                             sanitize_names=None, camelcase=None):
        """Makes a list of classes from a list of table names that are all
        defined in the give schema"""
        return [self.reflect_table(schema_name, x, sanitize_names, camelcase) \
                for x in table_name_list]

    def reflect_schema(self, schema_name, sanitize_names=None, camelcase=None):
        """Reflects all tables in a given schema"""
        table_names = self.inspector.get_table_names(schema=schema_name)
        return self.reflect_tables(schema_name, table_names, sanitize_names, camelcase)

    def reflect_database(self, sanitize_names=None, camelcase=None):
        """Reflects all tables in every schema of the database"""
        schema_names = self.inspector.get_schema_names()
        classes = list()
        for schema_name in schema_names:
            classes.extend(self.reflect_schema(schema_name, sanitize_names, camelcase))
        return classes

    def make_empty_object(self):
        """Create an empty object. It is a subclass of object, with the
        distinction, that this one can have its attributes manipulated with
        setattr"""
        EmptyObject = type('EmptyObject', (object,), dict())
        return EmptyObject()

    def underscore_to_camelcase(self, value):
        def camelcase():
            yield str.lower
            while True:
                yield str.capitalize

        c = camelcase()
        return "".join(next(c)(x) if x else '_' for x in value.split("_"))

    def capitalize_first_letter(self, string):
        return string[0].upper() + string[1:]

    def to_unicode(self, thing):
        """For Python 2 and 3 compatibility"""
        import sys
        if sys.version < '3':
            return unicode(thing)
        else:
            return str(thing)

    def to_string(self, thing):
        return str(thing)

    def sanitize_tablename(self, tablename, camelcase=True):
        """Returns a suitable class name derived from tablename
        - accents will be removed from letters
        - other than valid characters will be removed: [^A-Za-z0-9_]
        - underscores will be converted into CamelCase unless camelcase=False
        - initial numeric characters will be stripped
        - first character will be capitalized
        """
        from unicodedata import normalize
        from re import sub

        # Substitute accents and convert to ASCII
        normal = self.to_string(normalize("NFKD", self.to_unicode(tablename)).encode("ASCII", "ignore").decode())

        # Remove any unwanted characters
        new_normal = sub("[^A-Za-z0-9_]+", "", normal)

        # Convert underscore_identifiers to CamelCase
        if camelcase:
            camel = self.underscore_to_camelcase(new_normal)
        else:
            camel = new_normal

        # Remove initial numeric characters (class names cannot start with a
        # number)
        start_alpha = sub("^[0-9]+", "", camel)

        # Capitalize the first letter (only if CamelCase)
        if camelcase:
            classname = self.capitalize_first_letter(start_alpha)
        else:
            classname = start_alpha

        return classname

    def sanitize_schemaname(self, schemaname):
        """This sanitizes a schema name so that it can be used as a Python
        object.
        - accents will be removed from letters
        - other than valid characters will be removed: [^A-Za-z0-9_]
        - initial numeric characters will be stripped (must be A-Za-z_)
        """
        from unicodedata import normalize
        from re import sub

        # Substitute accents and convert to ASCII
        normal = self.to_string(normalize("NFKD", self.to_unicode(schemaname)).encode("ASCII", "ignore").decode())

        # Remove any unwanted characters
        new_normal = sub("[^A-Za-z0-9_]+", "", normal)

        # Remove initial numeric characters (class names cannot start with a
        # number)
        schema_name = sub("^[0-9]+", "", new_normal)

        return schema_name

100

101

102

103

104

105

106

107

108

109

110

111

112

113

114

115

116

117

118

119

120

121

122

123

124

125

126

127

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

176

177

178

179

180

181

182

183

184

185

186

187

188

189

190

191

192

193

194

195

196

197

198

199

200

201

202

203

204

205

206

207

208

209

210

211

212

213

214

215

216

217

218

219

220

221

222

223

224

225

226

227

228

229

230

231

232

233

234

235

236

237

238

239

240

241

242

243

244

245

246

247

248

249

250

251

252

253

254

255

256

257

258

259

260

261

262

263

264

# Author Mikko Kortelainen <mikko.kortelainen@techelp.fi>

# Tested (not very thoroughly) with Python 2.7.5 and 3.3.2, and SQLAlchemy 0.8.2

import sqlalchemy

from sqlalchemy import create_engine, MetaData, Table, Sequence

from sqlalchemy.orm import sessionmaker

from sqlalchemy.ext.declarative import declarative_base

from sqlalchemy.engine import reflection

class SQLReflector(object):

"""Reflects SQLAlchemy declarative classes from database.

Requires the SQLAlchemy database engine as argument. Tested only with

PostgreSQL. For example:

from sqlalchemy import create_engine

connstr = "postgresql+psycopg2://user:pass@host/database?sslmode=require"

engine = create_engine(connstr)

reflector = SQLReflector(engine)

session = reflector.session

MyTable = reflector.reflect("my_schema", "my_table")

a_row = session.query(MyTable).filter(MyTable.id == 1).one()

After a table has been reflected, it is saved in the SQLReflector object.

You can find them under the classes property:

reflector.classes.<schemaname>.<tablename>

By default schema and table names are sanitized so that characters which

are not letters, numbers or underscores will be stripped from the name.

Also, initial numeric characters will be stripped. What's left is treated

as the identifier. For table names (not schema names), the default is

also to convert the resulting name into CamelCase by first converting

it to lowercase and after that removing all underscore characters and

converting the character following the underscore to uppercase. For

example:

Table "my_data" becomes class "MyData"

Table "MyData" becomes class "Mydata"

You can turn CamelCase conversion off by giving camelcase=False.

If you give sanitize_names=False, the names are not sanitized at all

before making Python identifiers out of them. So your schemas and tables

should have names that are compliant already.

"""

def __init__(self, database_engine, sanitize_names=True, camelcase=True):

self.engine = database_engine

self.sessionmaker = sessionmaker(bind=self.engine)

self.session = self.sessionmaker()

self.metadata = MetaData(bind=self.engine)

self.inspector = reflection.Inspector.from_engine(self.engine)

self.classes = self.make_empty_object()

self.sanitize_names = sanitize_names

self.camelcase = camelcase

def __repr__(self):

"""Formal representation"""

return """<SQLReflector sanitize_names=%s camelcase=%s>""" % (

"True" if self.sanitize_names else "False",

"True" if self.camelcase else "False")

def get_schema_names(self):

"""Returns a list of available schemas in database"""

return self.inspector.get_schema_names()

def get_table_names(self, schema_name):

"""Returns a list of available tables in a schema.

For more in-depth exploration you can use the SQLAlchemy reflection.Inspector

instance inside the class. The member is called "inspector". """

return self.inspector.get_table_names(schema=schema_name)

def reflect_table(self, schema_name, table_name, column_definitions=None, \

sanitize_names=None, camelcase=None):

"""Reflects a table's metadata and creates an SQLAlchemy declarative

class with a nice sanitized CamelCase name out of it.

You may give extra column definitions or pretty much anything you want

to add to the class in the column_definitions argument. For example to

define an integer primary key to a PostgreSQL view:

from sqlalchemy import Column, Integer

MyView = reflector.reflect_table(

'my_schema',

'my_view',

column_definitions=

(

Column('id_col', Integer, primary_key=True),

)

Or you can define an explicit foreign key if such is not in place in the

database:

from sqlalchemy import Column, Integer, ForeignKey

MyTable = reflector.reflect_class(

'my_schema',

'my_table',

column_definitions=

(

Column(

'fk_col',

Integer,

ForeignKey('foreign_schema.foreign_table.foreign_column')

)

"""

# Sanitize names if requested

if camelcase is None: camelcase = self.camelcase

if sanitize_names is None: sanitize_names = self.sanitize_names

if sanitize_names:

class_name = self.sanitize_tablename(table_name, camelcase)

sane_schema_name = self.sanitize_schemaname(schema_name)

else:

class_name = table_name

sane_schema_name = schema_name

# Make a class from declarative_base()

Base = declarative_base()

class_bases = (Base,)

if column_definitions is not None:

# Add extra column_definitions argument as part of the new class

table_definition = Table(

table_name,

self.metadata,

*column_definitions,

schema=schema_name,

autoload=True)

else:

# No extra definitions

table_definition = Table(

table_name,

self.metadata,

schema=schema_name,

autoload=True)

# Make a proper class out of it

the_class = type(class_name, class_bases, dict(__table__ = table_definition))

# Save it as self.classes.<sane_schema_name>.<class_name>

if not hasattr(self.classes, sane_schema_name):

setattr(self.classes, sane_schema_name, self.make_empty_object())

setattr(getattr(self.classes, sane_schema_name), class_name, the_class)

# Also return it

return the_class

def reflect_tables(self, schema_name, table_name_list, \

sanitize_names=None, camelcase=None):

"""Makes a list of classes from a list of table names that are all

defined in the give schema"""

return [self.reflect_table(schema_name, x, sanitize_names, camelcase) \

for x in table_name_list]

def reflect_schema(self, schema_name, sanitize_names=None, camelcase=None):

"""Reflects all tables in a given schema"""

table_names = self.inspector.get_table_names(schema=schema_name)

return self.reflect_tables(schema_name, table_names, sanitize_names, camelcase)

def reflect_database(self, sanitize_names=None, camelcase=None):

"""Reflects all tables in every schema of the database"""

schema_names = self.inspector.get_schema_names()

classes = list()

for schema_name in schema_names:

classes.extend(self.reflect_schema(schema_name, sanitize_names, camelcase))

return classes

def make_empty_object(self):

"""Create an empty object. It is a subclass of object, with the

distinction, that this one can have its attributes manipulated with

setattr"""

EmptyObject = type('EmptyObject', (object,), dict())

return EmptyObject()

def underscore_to_camelcase(self, value):

def camelcase():

yield str.lower

while True:

yield str.capitalize

c = camelcase()

return "".join(next(c)(x) if x else '_' for x in value.split("_"))

def capitalize_first_letter(self, string):

return string[0].upper() + string[1:]

def to_unicode(self, thing):

"""For Python 2 and 3 compatibility"""

import sys

if sys.version < '3':

return unicode(thing)

else:

return str(thing)

def to_string(self, thing):

return str(thing)

def sanitize_tablename(self, tablename, camelcase=True):

"""Returns a suitable class name derived from tablename

- accents will be removed from letters

- other than valid characters will be removed: [^A-Za-z0-9_]

- underscores will be converted into CamelCase unless camelcase=False

- initial numeric characters will be stripped

- first character will be capitalized

"""

from unicodedata import normalize

from re import sub

# Substitute accents and convert to ASCII

normal = self.to_string(normalize("NFKD", self.to_unicode(tablename)).encode("ASCII", "ignore").decode())

# Remove any unwanted characters

new_normal = sub("[^A-Za-z0-9_]+", "", normal)

# Convert underscore_identifiers to CamelCase

if camelcase:

camel = self.underscore_to_camelcase(new_normal)

else:

camel = new_normal

# Remove initial numeric characters (class names cannot start with a

# number)

start_alpha = sub("^[0-9]+", "", camel)

# Capitalize the first letter (only if CamelCase)

if camelcase:

classname = self.capitalize_first_letter(start_alpha)

else:

classname = start_alpha

return classname

def sanitize_schemaname(self, schemaname):

"""This sanitizes a schema name so that it can be used as a Python

object.

- accents will be removed from letters

- other than valid characters will be removed: [^A-Za-z0-9_]

- initial numeric characters will be stripped (must be A-Za-z_)

"""

from unicodedata import normalize

from re import sub

# Substitute accents and convert to ASCII

normal = self.to_string(normalize("NFKD", self.to_unicode(schemaname)).encode("ASCII", "ignore").decode())

# Remove any unwanted characters

new_normal = sub("[^A-Za-z0-9_]+", "", normal)

# Remove initial numeric characters (class names cannot start with a

# number)

schema_name = sub("^[0-9]+", "", new_normal)

return schema_name

Leave No Bit Unturned

SQLAlchemy Declarative Class Reflector

Like this:

One thought on “SQLAlchemy Declarative Class Reflector”

Leave a Reply to Bijan Cancel reply

Share:

Like this:

One thought on “SQLAlchemy Declarative Class Reflector”

Leave a Reply to Bijan Cancel reply