Phew... that's even tougher.
If I understand your question, you're asking how to go about writing/implementing/creating/programming a database engine from scratch, similar to Postgres, Oracle, etc.
A database engine is a really big project with lots of high-level components (network communication, interprocess communication, SQL parsing, query planning and execution, etc. etc. etc.).
Probably the smallest database engine I know of is http://www.sqlite.org/. Its source code is likely (perhaps, maybe, you never know) the least complicated. If you spent enough time studying it, that should give you some ideas of how to proceed.
If I were going to design and build a database engine, I'd start with the design of the SQL language variant I'd want to support. This would require considerable knowledge of parsing and compiler construction. In and of itself I'd expect this to take up to several months just to specify the language. Building a parser whose output could be used by a query planner/execution engine would take several (maybe many, many) months after that. And this would still leave all the low-level nitty-gritty handling of disk storage/retrieval (and how to accomplish ACID transactions in the face of power failures, etc.). And it still wouldn't be multi-user, would have no networking layer, and none of the really advanced features (replication, sharding, high availability, etc.).
Think of the biggest model train (eg. http://www.telegraph.co.uk/news/picturegalleries/howaboutthat/5043783/The-worlds-biggest-model-train-set.html) and what you'd have to do to make literally everything from the ground up (eg. paint, glue, nails, rails, wire, glass, not just the beautiful models that are visible). It's that complicated.
So as a long-term hobby, say 10, 15, or 20 years, it certainly would be possible to create a full-fledged database engine comparable to Postgres/Oracle/SQL Server/etc.
SQLite is simple to work with; there are several user-friendly interfaces like SQLite Studio.
I would definitely not recommend Microsoft Access: it's expensive, it's overly complicated, and you learn very little of what databases are really about. And it doesn't easily expand to the web (and even then only if you plan on using Windows Server for your web-facing endeavors).
If you want a really good database with lots of growing potential you should go straight for postgres - it's the best free database and has excellent documentation.
You could use SQL Server Express Edition instead of LocalDB.
Edit: you could try to access LocalDB from MS Access through ODBC.
Try this. It might take a little to get the hang of but once you understand it you'll be in good shape. You'll scrape the page into csv and then you can import it into whatever spreadsheet or db you want.
Agreed - this seems to be a logical impossibility.
You cannot create a department, because you cannot start with zero or 1 employees. And you cannot assign employees to a department that doesn't exist due to the prior requirement.
Which explains why there is no such symbol anyway.
So the only conclusion is that your DB management class professor has a Zero to One relationship between himself and intelligence (on this one issue.)
Since you're looking to learn how to use a database more than learn how to set one up, I'd suggest using SQLite3 for your actual database. SQLite is an open source embedded database, and support for it is (afaik) bundled with current releases of PHP.
>I need to dig into something like the Paypal API, which hopefully simplifies the financial security part, thanks for the suggestion.
Have you seen Stripe? I've heard good things from a few friends but I have no idea if it fits your actual needs here.
> These are also called relational databases because you could link the tables.
This is a common misinterpretation of the word "relational" in this context. In fact it is based upon the mathematical concept of a relation; Ted Codd was a mathematician and he created relational algebra for modelling data.
Just because I work in an MS shop, I'd also throw MS SQL Server Express into the mix as a possibility.
Talend Data Integration Tool. It's free and supports almost all DB's going both ways. I've been using it for over a year. It's SOOOO much faster than all the other tools I've tried, including the paid ones. It's extremely flexible but I suppose with that flexibility comes a little price in manual set up where a lot of other tools will automatically map things for you. Check it out and if you have any other questions I'd be happy to help: http://www.talend.com/products/open-studio-di.php
A simple place to put the data to start with could be something like http://academictorrents.com
At least you could get the data out there and then see what people would do with it while you work to build out a usable interface.
Maybe this will work: https://www.metabase.com/
It's basically a human-friendly analytics interface to your database. You just download it (it's open source), connect it to your data, create visualizations with the provided builder or with SQL and then compose these visualizations into dashboards and share them. It's simple to get up and running and there's _tons_ of functionality.
For keeping versioning and doing create/update like you want, take a look at Liquibase; it can do what you want, and you can keep the scripts in Git. Also, if the application uses Maven/Ant, Liquibase can be integrated into the build, giving you the option to deploy application and DB changes at the same time. Actually, Liquibase does what you do: it keeps a table with the changes applied; that's how it knows which change to apply or roll back, and in what sequence.
But since you want to use ONE script to "rule them all", you will have to stick with the XML format and Liquibase "datatypes" so they can be translated "correctly" to Oracle or SQL Server. You will have to minimize the use of DB-specific datatypes/features to avoid problems, or maybe use the <preconditions> tag to execute specific SQL scripts against a specific database; I've never used this feature so I don't know if it works.
Another option you may want to look at is Oracle SQL Developer Data Modeler. I've used it for Oracle only and it works well with it. The only issue is that it uses Subversion; I don't know if there's a plugin for Git. They claim the DDL can be exported to SQL Server, but I've never tried it.
http://www.oracle.com/technetwork/developer-tools/datamodeler/datamodelerfaq-167683.html#exportimport
> Export a DDL script file for Oracle, DB2, UDB and SQL Server. There is a DDL file editor wizard to help you in defining the set of objects and different options in generation. A compare/merge functionality allows two models to be compared to create the update Alter statements.
Hi this is Alan from ArangoDB.
I agree that Foxx is a lot more complicated than it needs to be. This is why we've spent the past months completely rewriting the API for ArangoDB 3.0.
The upcoming major version will do away with models and repositories (just use the collections directly) and replace the controllers with routers, which are nestable and behave more like what JS developers might be used to from Node frameworks like Express.
Additionally we aim to put a stronger focus on backwards compatibility starting with 2.8. It will be possible to run 2.8 Foxx services on ArangoDB 3.0 and the 3.0 Foxx APIs will follow semver (i.e. remain backwards compatible until 4.0 rather than the previous deprecation policy).
We feel that the biggest advantage of "full-stack JavaScript" comes from allowing developers to move throughout the entire stack and making it easier to share knowledge in the team.
Additionally Foxx can allow some applications to drastically reduce the size of their existing application server by shifting most of the business logic closer to the database (avoiding unnecessary roundtrips or leaking implementation details of the database engine into the server's frontend). It also allows some novel approaches like handling GraphQL directly inside the database.
I encourage you to give Foxx another try when ArangoDB 3.0 comes out and would love to hear your feedback.
Hi this is Jan from ArangoDB.
Thanks for your kind words... just a little update on upgrades with v3.0 (releasing April 2016).
We'll implement persistent indexes, automatic failover and VelocyPack (our own format for serialization and storage)... if you like, check out our dev roadmap here: https://www.arangodb.com/roadmap/
OH!
In that case, look into Logstash. I'm not familiar with Logstash, but the idea behind it is that it's supposed to take logs and stash them into Elasticsearch.
I think you have to putz around with a JSON file that tells LogStash which columns are in what fields, but that's about it.
https://www.elastic.co/guide/en/logstash/current/first-event.html
I used that one before, but I have switched to HeidiSQL and can highly recommend it. It's a lot faster and only consumes a few tens of MB of memory, unlike Workbench, which continuously gobbles up hundreds of MB. Still only for MySQL though.
OP listed a table 'InvertedIndexEntry'. This is the same thing as a "full text" index. He is probably trying to build this manually in his application code, but there is no need, because a DBMS can do it for you automatically and will guarantee that it is in sync with the table:
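For illustration, a minimal sketch, assuming a Postgres-style table called documents with a text column body (both made up):

    -- An expression index on to_tsvector() is the built-in inverted index;
    -- the database keeps it in sync with the table automatically.
    CREATE TABLE documents (
        id   serial PRIMARY KEY,
        body text NOT NULL
    );

    CREATE INDEX documents_body_fts_idx
        ON documents USING GIN (to_tsvector('english', body));

    -- Full-text search without a hand-rolled InvertedIndexEntry table:
    SELECT id
    FROM documents
    WHERE to_tsvector('english', body) @@ to_tsquery('english', 'database & engine');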
See also:
But it is mentioned in the user manual:
> CONNECT
> Allows the user to connect to the specified database. This privilege is checked at connection startup (in addition to checking any restrictions imposed by pg_hba.conf).
Is this another way to control database access besides using pg_hba.conf (e.g. in cases where pg_hba.conf cannot be edited)?
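For reference, the SQL-level side of it looks something like this (mydb and reporting_role are placeholder names):

    -- pg_hba.conf is still checked first; the CONNECT privilege is an
    -- additional gate checked at connection startup.
    REVOKE CONNECT ON DATABASE mydb FROM PUBLIC;
    GRANT  CONNECT ON DATABASE mydb TO reporting_role;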
I like the look of JSON, but I've not played with it much myself. In my case, XML is usually used out of expedience and cooperation with existing systems, not choice. ;-)
It looks like there will be some JSON support in 9.2:
All this really brings is a specific type (which prevents malformed data), and some functions for turning arrays and rows into JSON.
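A rough sketch of those helpers (the values are made up):

    SELECT row_to_json(t) FROM (SELECT 1 AS id, 'Ada' AS name) AS t;  -- {"id":1,"name":"Ada"}
    SELECT array_to_json(ARRAY[1, 2, 3]);                             -- [1,2,3]

    -- The json type itself rejects malformed input:
    -- SELECT '{"broken":'::json;   -- ERROR: invalid input syntax for type json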
I read this as "Columns of type 'bit'", which makes sense in the SQL Server world - 1-8 bit columns take up a single byte, whereas 8 integers storing a 1 or a 0 would consume 32 bytes. Really, this comes down to "use the smallest data type you can get away with".
Looking at the Postgres Boolean documentation, your Boolean looks functionally equivalent to SQL Server's bit, except Postgres takes up one byte per column. While SQL Server does understand a Boolean data type, it doesn't support one inside tables or result sets.
I haven't used it, but Datagrip seems to have a fair amount of what you're looking for. There is a trial version, but I don't think there is a freeware version. Also, Toad Data Modeler looks like it fits the bill as well. Comes in free and paid editions. I used it several versions/years back, but haven't tried the more recent versions.
Not sure if this is what you mean, but... Data may be forever. Bank records, insurance records, stock purchases, tax information, etc. may all live through generations of application software. And at a given time several different application software packages may utilize the data. Spending time on good data architecture up front will typically prove worthwhile, in that relationships and rules that can be specified, stored, and enforced IN THE DATABASE eliminate the need to implement those functions in every application that utilizes the data.
“Show me your flowcharts and conceal your tables, and I shall continue to be mystified. Show me your tables, and I won’t usually need your flowcharts; they’ll be obvious.”
— Fred Brooks in “The Mythical Man-Month”
> Current SSDs have a limited amount of writes before they start to die
Your information is out of date. Large scale studies have shown that current generation SSDs are just as reliable and last just as long as spinning disks.
http://www.tomshardware.com/reviews/ssd-reliability-failure-rate,2923.html
Unless your primary concern is disk space there's no reason to still use spinning disks in your servers. Why spend $10k on a brand new shiny server and then hobble it with old, super-slow disks just to save $1k? If you don't need terabytes of disk space you should probably get the SSDs.
there are plenty of sample databases available:
https://duckduckgo.com/?q=sample+database
SQLite is a good choice as it's standards-compliant and popular (it's the default on Android and iPhone).
You can use http://sqlfiddle.com/ to play with databases without installing anything
Sign up for a SafariBooksOnline trial and power through some of their Access video courses.
For example:
https://www.safaribooksonline.com/library/view/microsoft-access-2016/100000006A0608/
Microsoft Access 2016, by iCollege. Publisher: iCollege. Release date: September 2017. Running time: 9:03:26. Topic: Microsoft Access.
Microsoft Access is now much more than a way to create desktop databases. It's an easy-to-use tool for quickly creating browser-based database applications that help you run your business. Your data is automatically stored in a SQL database, so it's more secure and scalable than ever, and you can easily share your applications with colleagues. This course will guide you through the basics of relational database design and through the creation of database objects. You will learn how to use forms, query tables and reports to manage data. You will understand the interface, customization, and creation and editing of the many objects available within the Microsoft Access application. The course is divided into three separate levels: Basic Microsoft Access, Intermediate Microsoft Access and Advanced Microsoft Access.
Instead of loading up a database administration tool that could potentially expose yourself to additional risk, try something like Metabase.
I've used it in the past with success.
Metabase is the easy, open source way for everyone in your company to ask questions and learn from data.
seems so:
http://www.querytool.com/help/1511.htm
But I think that portable app includes a GUI. I've never used it, but one of my old classmates really liked these portable apps. The other thing you could do is drop your favorite Linux on a USB stick and boot off of it. Then you could install whatever you want.
Thanks for your answer! Is there a GUI for SQLite available to query the DB more easily, for example with autocompletion of column names? At work I can use SQL Developer and AQT. Could they connect to SQLite?
I have nothing to do with these folks, but they have some good stuff to get you started.
https://my.vertabelo.com/community/database-models/most-popular/?allDbms=true
Web2py can be run as a standalone server. It comes with built-in SQLite and a web server. It's written in Python and can also easily hook into most mainstream databases out there with minimal code change. Lots of built-in security and ease-of-use features.
I would get a data modeling tool and reverse-engineer my database first. Most decent modeling tools have a way to export the data dictionary to Excel or to a report. Oracle has a free one.
Jupyter is correct. https://jupyter.org/
Agree with most of the other points. I wouldn't expect tons of information on most new grad resumes. Don't list out details of the classes, just give a list of some of the most relevant courses to the job.
I'd also put academic projects first since those are more interesting and they would set you apart from other applicants at your school. This is just personal opinion though, others may disagree.
I would add a section for technical skills. You can convey some of the information you learned in those classes here. Again, just a list. I can see Python, R, SQL, Tableau, Networks. Add whatever you think is relevant to the job. As an interviewer I want to be able to glance and get an idea of what skills you have.
It's very tough to do as a student, but as you get more experience try to read this from the recruiters perspective. What did this resume just tell me? It tells me you used a lot of space to show me you took a few classes that are pretty standard and I had to go all the way to the bottom to see what sets you apart. Not all recruiters will get to the bottom before putting it aside.
Some thoughts... You can set up a local environment to try out what you're trying to do. You can set up a WAMP or LAMP stack, which includes Apache, PHP, and MySQL. Go to http://www.ampps.com/download to download it.
8 columns is nothing in a database table and rows in the thousands is a drop in the bucket as well. Any RDBMS should be able to ORDER BY (SQL) thousands of rows with no issue. However, if you get into millions of rows, you may need to consider indexing the columns if query performance becomes an issue.
Consider if you really need to have this in a website. Would a local install of an RDBMS suffice?
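On the indexing point, a minimal sketch of what that looks like (the readings table and recorded_at column are made up):

    -- An index that matches the ORDER BY lets the database read rows already
    -- sorted instead of sorting millions of rows on every query.
    CREATE INDEX readings_recorded_at_idx ON readings (recorded_at);

    SELECT *
    FROM readings
    ORDER BY recorded_at DESC
    LIMIT 100;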
This seems like something a nontechnical user would come up with. You know, when all you have is a hammer, everything looks like a nail.
Sure you could do it, but why not just build a web front end? You have way more flexibility and control over what exactly you want to do. You also don't need to worry about the software on the local machines. There are also a ton of JavaScript libraries and templates out there that make this kind of stuff really easy and give you so many more features.
PostgreSQL provides table inheritance which implements this feature with minimal coding. It ensures that the columns in the parent table are also available in the child tables. Rows inserted into child tables are visible in the parent table automatically.
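A minimal sketch of that feature (the vehicles/trucks tables are made up):

    CREATE TABLE vehicles (
        id    serial PRIMARY KEY,
        make  text,
        model text
    );

    CREATE TABLE trucks (
        payload_kg integer
    ) INHERITS (vehicles);

    -- A row inserted into the child is visible through the parent:
    INSERT INTO trucks (make, model, payload_kg) VALUES ('Volvo', 'FH16', 25000);
    SELECT make, model FROM vehicles;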
I see the advice by /u/aaaqqq that, just to get you going, you should use Parse... and that isn't terrible advice if you're just doing this for fun and it isn't likely to be a project that will last years. It's good to get small projects under your belt, and learning about RDBMS principles at the start is a bit overwhelming.
If you want to learn a bit more about databases (which will help you in the long run, even if it's more up front work), you can get started by installing Postgres from here: http://postgresapp.com/
The Postgres docs are some of the best software docs i've ever seen, and they have enough information and examples to get you from knowing nothing, to expert level: http://www.postgresql.org/docs/9.4/interactive/index.html
The syntax you want is:
CREATE ROLE username WITH LOGIN;
See the fine manual.
Roles in Postgres work somewhat differently than they do in MySQL. The most salient difference is that a role is just a recipient of privileges, and a role can belong to other roles (or be a member of them, same difference). So roles can act as both groups and individual users. The login privilege allows you to log in and create an interactive session.
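A small sketch of one role acting as a group and another as a login user (the names are made up):

    CREATE ROLE readers;                                -- no LOGIN: behaves like a group
    CREATE ROLE alice WITH LOGIN PASSWORD 'secret';
    GRANT readers TO alice;                             -- alice is now a member of readers
    GRANT SELECT ON ALL TABLES IN SCHEMA public TO readers;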
If you have a "heart-on" for NoSQL, the PostgreSQL addon hstore can do a lot for you. It allows you to use a field in a row as a key/value store, which is great for loosely formed metadata.
This grants you the benefits of a battle-tested "real database", but with schema flexibility available where it makes sense for your application. Not to mention, individual keys in the hstore field can be indexed and queried for fast retrieval.
Heroku cloud databases support hstore out of the box.
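A rough sketch, assuming a made-up items table with an hstore column called meta:

    CREATE EXTENSION IF NOT EXISTS hstore;

    CREATE TABLE items (
        id   serial PRIMARY KEY,
        meta hstore
    );

    CREATE INDEX items_meta_idx ON items USING GIN (meta);

    INSERT INTO items (meta) VALUES ('color => "red", size => "XL"');

    SELECT * FROM items WHERE meta -> 'color' = 'red';  -- value of a single key
    SELECT * FROM items WHERE meta ? 'size';            -- rows that have the key at all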
My questions:
1: What are the relations between the items? Identifying, not identifying, many-to-many? Solid lines don't convey much.
2: What are your alternate keys?
3: Are item types exclusive sub-types? Why do so many of the sub-types have exactly the same fields?
4: I'm not seeing where you're keeping the address information for each shipment. That's generally an important thing.
Try using something like ERwin to build/draw your data model. It will be much easier to understand your data model if you're using standard notation. Don't be afraid to split things into 3-4 views if it helps things become more clear.
dbSchema is quite nice and the price tag isn't too high I think. It's also cross-platform.
TOAD Data Modeler is also quite good (although the UI feels a bit outdated and is a bit clumsy to use, but the feature set is good). But it only runs on Windows.
pgModeler is under active development and cross-platform too, but you will need to compile it yourself. It is only for Postgres though (but that's enough anyway ;) )
There is no stronger Access. There are similar products like Kexi or Filemaker, but that kind of database is limited in scale and capability, usually not cost effective, and typically legacy. Access also historically has a reputation for corrupting its databases with multiuser access.
The place to move up is to a real, client-server SQL database. PostgreSQL is fantastic and runs on your choice of platforms, but there's also certain cases to be made for MariaDB/MySQL, Microsoft SQL Server, Firebird, even IBM DB2 and Oracle in some circumstances.
Yeah, I'm aware of that, I didn't want to overwhelm the OP.
If all the OP wants is simply CRUD, using Django's admin functionality will mean they won't need much (or any?) SQL and very little HTML.
IMO this is a better choice than choosing something like Kexi, which might seem like a good idea at first but has its own set of flaws.
You could check out LibreOffice Base. It's like Microsoft Access, but free. It lets you design the data tables, then make a few forms to manage and display the data.
Hey /u/R0b0d0nut,
You are correct, but we will be including free evaluation of GridDB Standard Edition on our website pretty soon, meaning you will be able to test-drive the geospatial functionality for your application.
You can also of course try it out today on AWS if you don't feel like waiting.
Some details are pretty much necessary. You can install various types of RDBMS (relational database management systems) on your computer. But that's just data storage. Usually for presentation, data entry, reporting and so on you will need the client half of the client-server model.
A fairly easy and universal way to get started would be with something like postgresql and Django. That would make it easy for you to create ways to create, read, update and delete database records using the built-in admin interface. Here are some tutorials from our good friends at Digital Ocean.
There are a lot of other solutions, but we don't really know what your problem is so we can only guess.
Check their google groups out. Still under development. I use it as my main platform and was using it in my previous job as my main deployment platform for internal corporate data store, logging & app development.
You’d benefit by understanding relational models a little more. You can have an authors dataset, as well as a books dataset, and have an authors/books relationship dataset where it is one record per author per book. This would allow 1 book to have more than 1 author.
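A minimal sketch of those three datasets in SQL (all table and column names are assumptions):

    CREATE TABLE authors (
        author_id serial PRIMARY KEY,
        name      text NOT NULL
    );

    CREATE TABLE books (
        book_id serial PRIMARY KEY,
        title   text NOT NULL
    );

    CREATE TABLE author_books (              -- one record per author per book
        author_id integer NOT NULL REFERENCES authors,
        book_id   integer NOT NULL REFERENCES books,
        PRIMARY KEY (author_id, book_id)
    );

    -- All authors of one book:
    SELECT a.name
    FROM authors a
    JOIN author_books ab ON ab.author_id = a.author_id
    WHERE ab.book_id = 42;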
https://www.amazon.com/Data-Warehouse-Toolkit-Complete-Dimensional/dp/0471200247
It sounds like what you might be after is some kind of big-data analysis toolset (even though you have a subset of the data). Start with this article: https://www.tableau.com/learn/articles/big-data-analytics
You haven’t said much about what your needs are, but consider that a formal database may not be what you need, if what you’re trying to do is track this information for learning purposes. There are many low- and no-code database products out there, and something like Notion may well be sufficient, if you don’t need to run arbitrary queries on your data and are just looking to track some structural relationships.
I feel like you may be getting at the heart of at least some of what I'm attempting to accomplish. One of the things I need to model is product data, and, at surface, that seems incredibly complex if not outright complicated. I've been thinking about purchasing a copy of a text containing contrived solutions to common data-modeling problems (https://www.amazon.com/gp/product/0471380237/ref=ppx_yo_dt_b_asin_title_o00?ie=UTF8&psc=1), but I'm not sure it's worth it. I strongly suspect, just given what I've seen around the web, that the model for something like product data would be, for lack of a better term, deeply nested/dimensional. (A quick Google search on ER diagrams and dimensionality is teaching me to tread lightly so as to not conflate these things.)
David C Hays wrote what is probably the first book on reusable data modeling patterns about twenty years ago. I've read a couple of others since then. But I think his is the best.
May be difficult to find now, but I'd look through its table of contents online and maybe search out a copy.
https://www.amazon.in/David-Hays-Data-Model-Patterns/dp/0932633749
I'm in a similar boat as you. I don't have a formal education for databases and I'm currently in the process of building a database for one department for my company. This book https://www.amazon.com/Six-Step-Relational-Database-Design-development/dp/1481942727 has helped me tremendously to understand what the process looks like and to take my team through each step of building the database.
As a DBA at some point you will hear these words: "there is something wrong with the database". Sometimes the wrong thing is performance related and these two resources can help you. 1. SQL Antipatterns: Avoiding the Pitfalls of Database Programming (Pragmatic Programmers) https://www.amazon.com/dp/B00A376BB2/ref=cm_sw_r_apan_4T41EFVP65298WHD53Q6?_encoding=UTF8&psc=1 2. https://use-the-index-luke.com
I'm having a hard time picturing exactly what you're looking to accomplish, but if your team is already familiar with Spreadsheets, it could make sense to use a tool that makes Spreadsheets more powerful.
Coefficient is a GSheets Add-on that can pull data directly from your Salesforce org, but can also transfer data between multiple GSheets, and can write-back to Salesforce. You could likely use this to setup multiple Sheets to allow different types of users to make different edits, and then write their changes back to SFDC.
Disclaimer: I joined Coefficient's team last month, but I was working in Sales Ops roles that used similar tools previously. Take a look and feel free to reach out if I can answer anything.
I actually recommend https://airtable.com for this instead of a database. You can create forms there very easily and embed them. Then their inputs show up as rows in a spreadsheet.
RStudio if you want to do data analysis and visualization.
Also, look into python, Pandas, NumPy and IPython. Additional tools would be Matplotlib and Jupyter Notebook.
Python for Data Analysis: Data Wrangling with Pandas, NumPy, and IPython
"There is no such thing as a junior dba."
https://groups.google.com/forum/#!topic/comp.databases.oracle.misc/Hri0LD62FFc
http://www.reddit.com/r/Database/comments/2cdzpu/dbas_how_did_you_get_into_the_field/cjektvk
What amount of data are you talking about? SQLite easily goes into tens of gigabytes, especially fast if you also use SQLite extensions (like full-text search).
If you need something heavier, Postgres is around a 20 MB install on disk (maybe more with extensions), and it doesn't need a lot to run (8 MB per connection?).
> I are trying to optimize for low latency, high-throughput.
Low latency is all about having proper indexes, and querying only across indexes.
High throughput is about caching. Cache results often. Reply in your API that the cache should be kept for 15 minutes or longer.
> I want to optimize for low cost
Low cost is about using a small VPS box; SoYouStart (OVH) has decent deals for RAM+storage combinations https://www.soyoustart.com/ie/essential-servers/ or Hetzner https://www.hetzner.com/sb
Finally, you could precalculate (cache) some reports, so that there is less to calculate live... One way is to cache hourly summaries and only calculate the last hour live (see the sketch below).
Also remove some heavy search options, or charge customers more for the ability to run more complex ad-hoc queries.
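For the hourly-summary idea, one possible sketch in PostgreSQL (the events table and its columns are made up):

    -- Precalculate everything except the current hour; compute only the last hour live.
    CREATE MATERIALIZED VIEW hourly_stats AS
    SELECT date_trunc('hour', created_at) AS hour,
           count(*)                       AS events,
           avg(latency_ms)                AS avg_latency_ms
    FROM events
    WHERE created_at < date_trunc('hour', now())
    GROUP BY 1;

    -- Refresh from cron once an hour:
    REFRESH MATERIALIZED VIEW hourly_stats;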
It sounds like you're looking to build your own from scratch, so an existing app isn't really the answer you want. But with that said Cronometer is basically what you're talking about creating. Great app, and allows users to report their fully detailed input to a trainer.
Caveat: I know very little, particularly about LabVIEW
Firstly though, "Microsoft SQL Server" is a separate database system, so the provider you've chosen won't work with MariaDB. You need a driver (connector or provider are probably equivalent terms) for the specific database system you're using.
Is there a particular reason why you've chosen MariaDB? It's a fork of MySQL and probably fine, but it would perhaps be easier to find tutorials for a more common server. I would have tried something mainstream like Postgres or ideally SQLite, but it seems LabVIEW only has official support for Oracle, Access and Microsoft SQL Server. There's a free "Express" version of SQL Server which might be suitable for your use case: https://www.microsoft.com/en-us/sql-server/sql-server-downloads
With no idea about what "process" really means, I would guess the most efficient method would be an O(n) algorithm. Vague answer for a vague question.
Here is a detailed description of Big O notation: https://www.freecodecamp.org/news/big-o-notation-why-it-matters-and-why-it-doesnt-1674cfa8a23c/
Depends on the scale of your project.
There are two ways to go about it:
- If it's a small-scale project, you can consider easy-to-use frontend online database platforms like Stackby and use their developer API in your application. Best for small-scale prototypes, MVPs, etc., and using such a platform day to day is a good way to learn what you can achieve with databases.
- If it's a large-scale project, that's when the large-scale SQL/NoSQL databases come into the picture.
Here's a quick article on what relational databases are. They talk about one-to-one, one-to-many, and many-to-many relationships.
Assuming you're already employed somewhere.
Step 1: In the company you're currently working at, is there a database development or administration team? If yes, find out which kinds of databases are being used. If they are already using an RDBMS (Oracle, MS SQL Server, MySQL), get access to the dev/non-prod database.
Step 2: Learn SQL/PLSQL/Shell Scripting(BASH)
Step 3: Take DB Admin 101 class for whatever RDBMS software you have.
Step 4: Look for Jr. Developer positions at your company or outside. Then switch.
You may also look at this post I made a few days ago. It's kind of related to your question.
https://www.reddit.com/r/Database/comments/3skj2k/despite_being_certified_in_oracle_database_i_was/
I am planning to create a course that includes everything that is mentioned above. If you're interested, please take this survey.
https://www.surveymonkey.com/r/FCVN3DZ
You will find some good courses on udemy for the same topics.
Knowledge of the internal workings of a database is also helpful in design. Check these two books:
What is your opinion of this thread?
https://stackoverflow.com/questions/14625701/designing-an-e-commerce-database-mysql
Right now I'm looking into doing option 3 indicated in the post above.
Having a normalized table with really unnormalized 1:1 related tables, something like this:
http://i.imgur.com/6wfSgWQ.png
Then running queries per categorical table to pull up things like list prices, weights, length, width, height, etc. for my normalized productTable.
I can understand how to maintain a database once it's up, but I have a bunch of normalized and unnormalized data sitting in spreadsheets right now (some webscraped data, some manufacturer spreadsheets, some 3rd-party data) where each dataset has something different to offer.
My problem is initialization, and then the workflow of adding new data afterwards (create, update).
Ya, I'm sure a Qt app would take the same amount of time as an Access one lol, and cost the same.
You don't need much VBA if you're making simple forms. If VBA's not being actively developed but does the job, does it matter? It's not going away, with a billion lines of VBA in Excel files out there.
https://stackoverflow.com/questions/187506/how-do-you-use-version-control-with-access-development
https://stackoverflow.com/questions/3569362/unit-testing-in-ms-access
I would recommend ElasticSearch with ingest attachment plugin, see https://www.elastic.co/guide/en/elasticsearch/plugins/7.6/ingest-attachment.html.
Elastic is IMO the easiest way of creating a very capable search solution for textual data. You should not, however, think of it as a database, it lacks a lot of features in that area. But as an index and search engine, it is really, really powerful and relatively easy to set up.
It's really simple, and you should start learning the correct way of doing things. Use not only OOP but also "prepared statements".
Follow this - though, instead of doing an INSERT, do a SELECT (use PHPMyAdmin to display examples of SELECTs)
With the results, you can do a var_dump to see what you got, and later you can do a loop, so you can format each result properly.
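If you want to see what a prepared statement does at the SQL level (separate from the PHP/PDO side), MySQL's server-side syntax is a reasonable sketch; the users table here is made up:

    -- The '?' placeholder is sent separately from the data, which is what prevents injection.
    PREPARE find_user FROM 'SELECT id, name FROM users WHERE email = ?';
    SET @email = 'alice@example.com';
    EXECUTE find_user USING @email;
    DEALLOCATE PREPARE find_user;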
IBM Db2 has a Community Edition, which is free for local installs on small machines: fewer than 4 cores and 16 GB of memory, no data cap, no time limit, and it's available as a Docker image. Also free is Db2 Lite on the IBM Cloud. The limits there are data size (200 MB, I think) and concurrent connections, and if you do not use it for 30 days they shut it down.
The compatibility modes were introduced in MySQL 4.1. In MySQL 5.6, new configurations have strict mode on by default (2013). It will be on by default for MySQL 5.7: https://www.digitalocean.com/community/tutorials/how-to-prepare-for-your-mysql-5-7-upgrade
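To check or change the mode yourself, something like this works in MySQL:

    -- See which SQL modes are active, then turn strict mode on for the session.
    SELECT @@GLOBAL.sql_mode, @@SESSION.sql_mode;
    SET SESSION sql_mode = 'STRICT_TRANS_TABLES,NO_ENGINE_SUBSTITUTION';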
https://www.scribd.com/fullscreen/45931272?access_key=key-iykypklqvwg2odxcz5r&allow_share=true&escape=false&view_mode=scroll
This is the database I am trying to model my reservation system on, but I couldn't get the relationships working.
Maybe look at this:
https://www.elastic.co/blog/text-similarity-search-with-vectors-in-elasticsearch
Here’s the sample repo
https://github.com/jtibshirani/text-embeddings
Also, Elasticsearch is a NoSQL DB, so you can insert docs as JSON.
I've given you most of what I personally know, so I'd open the floor here to others who have different experiences than me. But, to sum up, any time I see any of the following, I at least consider Elastic.
Additionally, it sounds like you're doing telemetry work. That's a very, very common use case for Elastic, and they've got a webinar that explains how their products can be used in that situation here: https://www.elastic.co/webinars/using-the-elastic-stack-for-sensor-data-telemetry-and-metrics
Good luck!
In that case, I'd definitely look through Hoffer, Ramesh, and Topi. It'll give you some structure, rather than just casting about on Google.
(Incidentally, that's the strategy that got me my first real tech job--reading through a book to get the basics down.)
To get some hands-on experience with an actual database, you could sign up for the Amazon Web Services free tier and install MySQL on an EC2 machine.
Edit: If it's a grant-funded gig and you're working as a graduate student, it will almost certainly be cheaper to hire you, as contingent labor, rather than a professional DBA or data engineer or whatever they're looking for.
Just to follow up, the "effective_io_concurrency" setting is just a tunable that sets how many pages to prefetch when doing a heap scan. It may not be as explicit as what is available for Oracle batch processing, but it's probably a good optimization for a lot of PG users and hardware.
The postgresql config code is interesting.
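For what it's worth, a hedged tuning sketch (effective_io_concurrency is the real setting; 200 is just a common starting point for SSD-backed storage):

    ALTER SYSTEM SET effective_io_concurrency = 200;
    SELECT pg_reload_conf();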
I finally found the material that made my lead choose this DB. They have a use-case scenario where it really fits what the company is going to do, except it's fleets of aircraft instead of sucker rod pumps and oil rigs. Here's ArangoDB's whitepaper, the PDF I uploaded to DocDroid.
What do you think? Seems way too good to be true
Yeah, I'd probably use a database for something like this.
If you've got the basics of SQL down, I think you'll find that for your use case, converting your spreadsheets into database tables won't be so bad.
Making an application around the database with nice forms and reports, however, is another story. I suffered through a lot of MSAccess when I was learning. With Access, step 1 was "Make some tables", and step 2 was "Learn VBA". I still don't think it's terrible for what it is, in that it's an off-the-shelf suite that provides enough tools so that you're not directly editing rows and columns in manner similar to a spreadsheet...but it wouldn't be my first choice these days.
LibreOffice Base (which I tried several years ago) and KEXI (which I haven't tried) are some free/open-source alternatives. They may help get you started in what you're looking to build.
Have not used it, or even tried it, but for a small setup like you are describing perhaps Airtable may be worth checking. Their tag line is "The power of a database with the familiarity of a spreadsheet"
Given your background, it may be a little too much to try to go and build an app yourself or manage a database (in the cloud or self-hosted). So perhaps a solution like Airtable may be a starting point.
In my opinion, identifying what you need to collect and starting to put it somewhere will get you the most benefit while you figure out the best long-term storage / stack / visualization for the data.
You can certainly use ElasticSearch to search by particular fields in a document, but running ES in production is quite complex.
Typesense is another way to do this, and has a much simpler intuitive API. There’s also a guide on how to integrate DynamoDB with Typesense here: https://typesense.org/docs/0.21.0/guide/dynamodb-full-text-search.html
Hi Nicadimos, slava @ rethink here. Rethink has binary packages for 64-bit ubuntu, we'll be adding more, but due to slight differences in linux distros, it's hard to make them work across ubuntu/debian/mint.
As far as the syntax errors, I have a couple of questions:
It'd be easier if you asked the questions on our github page (https://github.com/rethinkdb/rethinkdb/issues) or google groups (https://groups.google.com/forum/?fromgroups#!forum/rethinkdb) until we get a version of stack exchange going.
Cheers!
There is open source software out there that will probably do what you want. Odoo is a python based one that comes to mind.
Ah, this is because on OS X the user postgres does not exist by default.
You need to add your db name, username and password in config.py before you run db.py migrate
If you are using http://postgresapp.com/ then "createuser postgres" from your terminal should do it; otherwise log into psql and run CREATE USER postgres;
You will lose some data from the buffers and maybe some indexes (you know, they are not written to ~~stone~~ disk). These databases are known as corrupted and can be recovered with gfix -m: http://www.firebirdsql.org/manual/gfix-dbverify.html. The end result is: if you don't need too much write performance, then it's better to have the data on the disk (and guaranteed written to it) with fsync: http://www.kernel.org/doc/man-pages/online/pages/man2/fsync.2.html
There's a TV database that has an API.
https://thetvdb.com/api-information
But that's more just like episode information.
You might be able to use Wikipedia's API to get topics and categories but I'm not sure if there's just csv's laying around.
Have you actually been trying to figure this out for 7 years? Or is there another reason you would copy/paste word for word from this 7 year old post from another forum?
Most wikis will require a backend DB. Atlassian Confluence would meet your needs except for cost. Maybe something like https://www.bookstackapp.com/ would work for you, or another open source product.
There are a lot of web-based visual database management systems aimed at end users. You probably want one of those, rather than having to administer your own DBMS. Unfortunately, I don't think there's a widely accepted name for this category of software, so it can be hard to find all the options. A good place to start might be this page on alternatives to Microsoft Access.
Definitely skip SQLite - even with faster storage you can have blocking issues if both programs need write access to the database. Here are two approaches that will work with Tornado's IOStream class:
1) Included in pyserial is a serial-to-TCP bridge example program - run it on your serial port and connect to it using IOStream (example in the IOStream docs).
2) Run your data reader program via subprocess and use PipeIOStream in Tornado to read standard output and shoot it down your web socket.
In both cases register a callback with "read_until" and "write_bytes" and shoot data down your websocket accordingly.
Yup, you need a database. Looking at the dataset size (I'm guessing little to no normalisation), maybe a Time Series DataBase would be a better fit?
https://www.influxdata.com/time-series-database/
https://www.guru99.com/database-normalization.html
You either have money (pay somebody to pick the tools & design the DB) or you have time (learn about these technologies yourself). Getting started without understanding some of the basics will result in astronomical cloud provider bills and very little progress.
Ah that's neat :) I'm not too experienced with databases personally, is that more of an SQL feature?
I have seen that InfluxDB seems to address my concerns in a similar way, if it's an appropriate database for statistical queries (I think that's what I'm discussing here, such as using likes or upvotes to score/rank content for viewers).
https://www.influxdata.com/time-series-platform/influxdb/
> InfluxDB can handle millions of data points per second. Working with that much data over a long period of time can create storage concerns. InfluxDB will automatically compact the data to minimize your storage space. In addition, you can easily downsample the data; keeping the high precision raw data for only a limited time, and storing the lower precision, summarized data for much longer or forever. InfluxDB offers two features—Continuous Queries (CQ) and Retention Policies (RP)—that help you automate the process of downsampling data and expiring old data.
> I'd feel terrible about "semi-faith-based" solutions.
Me too actually. However, I want to build something similar to CamelCamelCamel and they don't include any indicators whatsoever. So this would be a huge step up.
My bigger concern actually was just in poor database design. I could store all of the updates & create a view, but the amount of data would be obscene. Assuming I were unwilling to keep all of that info, would this be an acceptable database design?
EDIT: I chose a bad example. Products with strong fluctuations in price are generally popular & thus accurate. Those with minimal fluctuation span months with the same price. The user needs to know if the price really did not change or if it wasn't checked due to lack of popularity.
Yep, I got it pretty much all working.
I'm using http://fastglacier.com/ w/ a SQL Server Agent Job to automate uploading the backups to glacier. All is going well except for some odd configuration issue, but I have it working manually, at least. :)
Technically, yes, but should you? I would say no, unless you change the DB's port. Standard DB ports are well known and constantly scanned for. Even if you change the port and others find your DB, they could get a foothold in your home network. Also, some internet providers do not allow you to have servers accessible from the world.
I assume it is just a class of people hitting the DB? XAMPP might be a good solution. Letting the students know it is running off your laptop will set their expectations for performance.
Microsoft has a free trial for Azure which includes SQL Server. http://azure.microsoft.com/en-us/pricing/free-trial/
Well, this is going to be quite a task.
First, you'll want to set up an MS Azure DB.
Once that is situated then you can start to create the necessary patch work of code to replicate the records from the tablet/Access DBs up to the Azure DB.
Actually, start off by researching the concept of ETL, Extract, transform, load.
Plan, Plan, PLAN! You should spend 95% of your time planning, and the remaining 5% should be spent executing the plan. It will be a nightmare to restructure even a single field once the DB is in production.
Yes, in fact the "Getting Started" guide includes login details for a demo database that is currently living on SQL Azure: http://querytreeapp.com/help/
You will need to grant access to your IP address in the Azure control panel as well as using the right server, database and user names: http://azure.microsoft.com/en-us/documentation/articles/sql-database-get-started/#ConfigFirewall