Sample data create and import

Revision as of 10:07, 19 April 2011

Data creation and import is split into two tasks:

The import is done via an import file. Decision: keep it simple for the user.

Keeping the focus: the focus is creating sample data for the database, not import/export.

The above points were decided in a phone call with Timo.

This page acts as a whiteboard for displaying the current state of this task and working towards a solution. The task itself is tracked in the two issues stated above.

Creating sample data

Goal: creating sample data for the database.

The sample data should have

  • many donors
  • many recipients
  • many donations
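
As a rough illustration of the goal above, the following Python sketch generates many donors, recipients and donations from a seeded random number generator. All names (`Donation`, `generate_sample_data`, the "Donor N" labels, the amount range) are assumptions made for this example; none of them come from OpenPetra or from the GenerateSampleData tool.

```python
import random
from dataclasses import dataclass


@dataclass
class Donation:
    donor: str       # name of the giving partner (hypothetical field)
    recipient: str   # name of the receiving partner (hypothetical field)
    amount: int      # whole currency units, arbitrary range for illustration


def generate_sample_data(n_donors, n_recipients, n_donations, seed=0):
    """Return (donors, recipients, donations) built from a seeded RNG."""
    rng = random.Random(seed)  # seeded so repeated runs give identical data
    donors = [f"Donor {i}" for i in range(1, n_donors + 1)]
    recipients = [f"Recipient {i}" for i in range(1, n_recipients + 1)]
    donations = [
        Donation(rng.choice(donors), rng.choice(recipients), rng.randint(1, 500))
        for _ in range(n_donations)
    ]
    return donors, recipients, donations


donors, recipients, donations = generate_sample_data(100, 20, 1000)
print(len(donors), len(recipients), len(donations))
```

Seeding the generator keeps the sample data reproducible, which may help when the same data set has to be re-created for testing.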

Lists of test data generators:

The majority of the software listed below was extracted from a former page.

Idea to be checked: use data from generatedata / geo-database / briandunning together with benerator to compile the data into a common format, which is then imported as shown below.

interest? | Program | creates | area | Output | App-Type | License
* (if this is as good as it claims, this is the one!) | benerator (http://databene.org/databene-benerator.html) | transforms given data to test data (includes various filters) | | various databases, xml, csv, excel | Framework | GPL / commercial
* | generatedata.com | Addresses / Cities / Countries | Netherlands, Canada, UK, US | XML, Excel, HTML, CSV, SQL | Webapp (JS,PHP,MySQL) | GPL v2
* | Geographical Places Database (http://www.webresourcesdepot.com/free-geographical-database-of-all-countries-over-8-million-places-geonames/) | geographical locations (schools, universities, whitehouse, eiffel tower...) | | tab delimited | website, download, libraries (various languages), webservice | creative commons attribution
 | http://www.briandunning.com/sample-data/ | Website with real address and company data (US and Canada) but with fake names. This could also be useful for testing map services, since the geographic locations are real. | US, Canada | | | free
 | DBMonster | generates test data | | SQL | Command-Line (Java) | Apache License
 | CSV Data generator | | | CSV? | (Ruby) |
 | Datagenerator | | | | library / GUI | GPL
 | dqMaster | | | text,xml,db | GUI (extensible) |
 | Spawner Data Generator | random proper names, terms and connectors | | delimited text / SQL | apptype | license
 | Test Dictionary | | | | java interface |
data only | Fresh Trash Generator | Random Website, Email, Family and First Names, Phone Number, Company, Birthday | Greek Names and Companies, German Streets | java utility package | | contains at least some interesting resource data (German Streets, European Cities Zip Codes)
nn | google api toolkit | nn | | | Web API |
- | Data Science Toolkit | convert address to coordinates and vice versa, ip to coordinates etc | | | Web API / VM |
- | fakenamegenerator.com | Names, addresses from many countries | | | Website / Web API | proprietary for API (free of charge, but attribution)
 | GEDIS Studio for Test Data | "Realistic Test Data" (not viewed) | | CSV, XML, SQL, or HTML | Windows / Scripting | community edition free of charge / commercial
- | Excel random data generator | Generates sample data, somewhat acclaimed here | | | MS Excel Plugin | commercial
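
The idea noted before the list (compiling data from several sources into one common format) could look roughly like the Python sketch below. It is an illustration under assumptions only: the column names of the inputs, the common field list and the helper names are all hypothetical, and the real exports from generatedata, the geographical database or briandunning may use different headers.

```python
import csv
import io

# Hypothetical common format; the real import format is not defined here.
COMMON_FIELDS = ["first_name", "last_name", "street", "city", "country"]


def read_rows(text, delimiter):
    """Parse delimited text with a header line into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def to_common(rows, mapping, defaults=None):
    """Rename source columns to common field names; fill gaps with defaults."""
    defaults = defaults or {}
    records = []
    for row in rows:
        record = {field: defaults.get(field, "") for field in COMMON_FIELDS}
        for source_column, field in mapping.items():
            record[field] = (row.get(source_column) or "").strip()
        records.append(record)
    return records


def write_common(records):
    """Serialise the records as CSV text in the common column order."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=COMMON_FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return out.getvalue()


# Made-up inputs: a comma-separated address list and a tab-delimited place list.
ADDRESS_CSV = "First,Last,Address,City\nAnna,Example,1 Main St,Springfield\n"
PLACES_TSV = "name\tcountry\nParis\tFR\n"

ADDRESS_MAPPING = {"First": "first_name", "Last": "last_name",
                   "Address": "street", "City": "city"}
PLACES_MAPPING = {"name": "city", "country": "country"}

records = to_common(read_rows(ADDRESS_CSV, ","), ADDRESS_MAPPING,
                    defaults={"country": "US"})
records += to_common(read_rows(PLACES_TSV, "\t"), PLACES_MAPPING)
common_csv = write_common(records)
print(common_csv)
```

Keeping one mapping per source means a new data source only needs a new mapping, not new conversion code.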


Coding

Some coding has already been done: see csharp\ICT\PetraTools\GenerateSampleData, which transforms sample data into family records etc.

Also see the partner import module, which processes CSV and YAML files: csharp\ICT\Petra\Client\lib\MPartner\gui\PartnerImport.ManualCode.cs

Importing sample data

The import is done via an import file. Decision: keep it simple for the user.

Keeping the focus: the focus is creating sample data for the database, not import/export. Import/export is a simple tool, and we put effort into keeping it nice, simple and easy to understand. In this case, however, it is a tool for sample data only.

Make the import file as simple as possible for the user, e.g. consciously limit the scope of the import file's capabilities (one address per person) rather than aiming for a powerful import file.
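
To illustrate what a deliberately limited import file might mean in practice, here is a minimal Python sketch of a reader that accepts one line per person and rejects a second address for the same person. The column names (`name`, `street`, `city`, `country`) and the function `parse_import_file` are assumptions for this example; the actual import file layout has not been decided on this page.

```python
import csv
import io

# Hypothetical columns of the simple import file.
REQUIRED = ("name", "street", "city", "country")


def parse_import_file(text):
    """Return {name: address dict}; raise ValueError on unsupported input."""
    reader = csv.DictReader(io.StringIO(text))
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"missing columns: {', '.join(missing)}")
    people = {}
    # Data starts on line 2 because line 1 holds the column headers.
    for line_no, row in enumerate(reader, start=2):
        name = (row["name"] or "").strip()
        if not name:
            raise ValueError(f"line {line_no}: empty name")
        if name in people:
            # The format is limited on purpose: one address per person.
            raise ValueError(f"line {line_no}: second address for {name!r}")
        people[name] = {c: (row[c] or "").strip() for c in REQUIRED[1:]}
    return people


sample = (
    "name,street,city,country\n"
    "Anna Example,1 Main St,Springfield,US\n"
    "Ben Sample,2 Side St,Toronto,CA\n"
)
people = parse_import_file(sample)
print(sorted(people))
```

Rejecting unsupported input with a clear message, instead of trying to handle it, is one way to keep the file format easy to explain to users.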

Consider data liberation?

Not necessarily - only if useful to keep it simple and make it work quickly.

Intended location of data in OpenPetra

Data | Table
Person | p_family
Address | -
Donations | -

p_family will be used for all data, and p_person will be ignored. (This is in line with the attempts to replace p_person by p_family.)