Two questions, please.
1) I am reading about the FEDERATED storage engine, but it is not clear to me what advantages it has over a simple remote connection. Is there any difference or advantage?
2) (The real question) Here is the situation: I have a MySQL database and I need to read sensitive data from other databases, probably running a different DBMS. I will probably only have read access (I don’t need more privileges for this task anyway).
Options
- FEDERATED storage engine (only solves the issue between MySQL databases)
- Database abstraction library (PDO, Zend)
- Build an API for each external database
- Sync my database with the others (maybe overkill for this purpose)
All I need is: “Is john in your database? Yes or no.”
What’s the best choice?
Thanks!
It’s very hard to tell whether you’re trying to solve a technical problem or some kind of security/management problem. I’m going to address what I believe is your actual technical problem—determining if a user is in your database—and describe tradeoffs of each of your possibilities. Let’s address possible solutions in order of complexity.
1. Direct MySQL database connection
This means using a MySQL-specific database driver to connect directly to the remote database and issue your hard-coded SQL statement. This is fine for internal use if you can be sure that both ends have the same knowledge of the schema, both ends are (preferably) on the same internal network, and you are responsible for all the clients that will connect and need this information.
2. PDO database connection
If you have agreement about the schema, but not the database vendor (e.g. one side may be using PostgreSQL instead of MySQL), then PDO is a better choice. As long as your queries are fairly simple, you are unlikely to run into database compatibility issues. This is a better idea than #1 because it takes about the same amount of work but is more flexible.
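The yes/no lookup behind options #1 and #2 can be sketched in a few lines. This is only an illustration, written in Python against the standard DB-API rather than PDO, with an in-memory SQLite database standing in for the remote server so that it runs as-is. The `users` table, its `name` column, and the `user_exists` helper are assumptions, not something from the question. Note that the placeholder style differs between drivers (`?` for `sqlite3`, typically `%s` for MySQL drivers), so the query string is not fully portable even when the SQL itself is.

```python
import sqlite3


def user_exists(conn, name):
    """Return True if a row with this name exists in the (hypothetical) users table."""
    cur = conn.cursor()
    # Parameterized query: the driver handles quoting, so the name is never
    # concatenated into the SQL text. `?` is sqlite3's placeholder style.
    cur.execute("SELECT 1 FROM users WHERE name = ? LIMIT 1", (name,))
    found = cur.fetchone() is not None
    cur.close()
    return found


# Stand-in for the remote database; a real setup would connect with
# read-only credentials to the other party's server instead.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT PRIMARY KEY)")
conn.execute("INSERT INTO users (name) VALUES ('john')")

print(user_exists(conn, "john"))  # True
print(user_exists(conn, "mary"))  # False
```

The function only needs `SELECT` privileges and returns a boolean, so no row data leaves the database beyond the fact that a match exists.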
3. Database replication
I recommend extreme caution before deciding to use database replication. Replication is complex to set up and requires oversight and management. There is no such thing as a simple replication setup. However, it may be appropriate if:
If you need some or all of these features and are willing to pay the maintenance price, it may be the right choice. Bear in mind that it is a lot of work to deal with replication, and it will tie both sides to the same database vendor.
4. FEDERATED storage engine
I would recommend against using this for these reasons:
- FEDERATED only works between MySQL databases
- it introduces new kinds of failure without meaningfully simplifying your application code
I’m looking at this engine choice wondering what it could be good for. I suppose if you moved a table from one database to another but didn’t want to or couldn’t change the application code that queries it, it may be appropriate, but I would not consider this for an up-front design. It will be fragile without improving clarity or performance.
5. Write an API
All of the choices up to this assume that you can trust everyone who needs the information with database credentials. If you need to give access to this information to third-parties, an API is a good choice because:
APIs have disadvantages as well, though:
Further questions and notes
You mention that this is “sensitive” information. Let me point out that for PCI compliance and HIPAA compliance and other situations where you are dealing with private data protected by law, none of these options are appropriate, because they all will involve decrypting and sharing data across computers that should not have access to it. When you query a database on machine A from machine B, it is very difficult to be certain of exactly how many copies of the data you’ll have in memory. If this is your situation—private, encrypted, legally protected data—you will need to go to great lengths to ensure your solution is legally sound and I am not in a position to offer advice as to how to proceed, other than to say that the foregoing is insufficient.
If that is not the case, I would say the best solution for internal use in terms of efficiency and simplicity is to use PDO (#2). Otherwise, build an API (#5).
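For the API route (#5), the whole surface can be a single endpoint that answers yes or no. The following is a rough sketch using only the Python standard library. The `/exists` path, the `name` query parameter, the JSON shape, and the in-memory `KNOWN_USERS` set (standing in for the real database lookup) are all assumptions made for illustration. It has no authentication or transport security, both of which a real deployment would need.

```python
import json
from http.server import BaseHTTPRequestHandler
from urllib.parse import parse_qs, urlparse

# Hypothetical stand-in for the real database lookup.
KNOWN_USERS = {"john"}


def user_exists(name):
    return name in KNOWN_USERS


def build_response(path):
    """Map a request path such as /exists?name=john to (status, JSON body)."""
    parsed = urlparse(path)
    if parsed.path != "/exists":
        return 404, json.dumps({"error": "not found"})
    names = parse_qs(parsed.query).get("name")
    if not names:
        return 400, json.dumps({"error": "missing name parameter"})
    return 200, json.dumps({"exists": user_exists(names[0])})


class ExistsHandler(BaseHTTPRequestHandler):
    """Thin HTTP wrapper; all decisions are made in build_response."""

    def do_GET(self):
        status, body = build_response(self.path)
        payload = body.encode("utf-8")
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


# To serve it locally, one could run:
#   from http.server import HTTPServer
#   HTTPServer(("127.0.0.1", 8000), ExistsHandler).serve_forever()

print(build_response("/exists?name=john"))  # (200, '{"exists": true}')
print(build_response("/exists?name=mary"))  # (200, '{"exists": false}')
```

Clients never see credentials, the schema, or the database vendor; they only see the boolean answer, which is what makes this option suitable for third parties.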