Data Ownership

Data Ownership Project

This wiki page is where I'm capturing all things related to data ownership.

Delivered presentations

I've presented on this a few times.

  • SFVLUG - March 6th 2010 - download pdf
  • UUASC - March 8th 2010 - download pdf
  • SGVLUG - March 11th 2010 - no download; same talk as the others, just a different venue and date.

Why am I doing this?

  • I help my clients build systems and processes/policies to own their data. I need to drink my own champagne.
  • It sounds cool?

How am I going about it?

Step 1: Map out where my existing data is

  • Data that is generated by daily activities
    • health data (stored on servers at my health care provider. Generally not easily accessible)
    • financial data (stored by financial institutions. Generally easily accessible in the form of CSV dumps)
    • search data (stored by search engines. Generally not accessible)
    • requests for directions (stored by various providers. Generally not accessible)
  • Data that is created by me
    • Textual
      • Blogs (often hosted on wordpress/blogspot. easily accessible and movable)
      • Tweets (hosted on twitter. easily accessible)
      • Social networking posts (hosted on various sites. accessibility varies)
      • Documents/Invoices/Spreadsheets
      • Financial data
      • Analytic products
    • audio
      • Podcasts
      • Voice notes
    • video
    • pictorial
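
The "daily activities" part of the map above can be kept as a small machine-readable inventory, which makes it easy to list what is still locked away. This is only a minimal sketch: the `DataSource` type, its field names, and the `locked_in` helper are hypothetical, and the true/false flags simply restate the "accessible / not accessible" notes in the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    name: str         # kind of data, taken from the list above
    held_by: str      # who stores it today
    accessible: bool  # True if a copy can be pulled out easily


# Entries mirror the "generated by daily activities" bullets.
INVENTORY = [
    DataSource("health data", "health care provider", False),
    DataSource("financial data", "financial institutions", True),
    DataSource("search data", "search engines", False),
    DataSource("requests for directions", "various providers", False),
]


def locked_in(sources):
    """Return the names of sources that are not easily accessible."""
    return [s.name for s in sources if not s.accessible]


if __name__ == "__main__":
    for name in locked_in(INVENTORY):
        print("still locked in:", name)
```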

Step 2: Evaluate replacement systems

Type of service/application | What I was using before and/or what others are commonly using | What I have migrated to | Notes/URL
News | Google news/google reader | Dashboard, Tattler and rss2email | http://dashboard.knownelement.com (haven't deployed tattler yet). In the very early stages of using this system. Existing usage I have seen is quite cool.
Photos | flickr | gallery2 | http://photos.knownelement.com (imported all my flickr photos and now upload straight here)
Microblogging | Twitter | status.net | http://mblog.knownelement.com (bridges my posts to twitter; pulling posts in from twitter fails a lot due to twitter scalability issues)
Blogging | livejournal/blogspot | wordpress | http://blog.knownelement.com (very happy with wordpress. imported all my lj posts, need to pull in my blogspot posts)
Issue tracking and software project management | Basecamp/lighthouse/sourceforge/google code | redmine | http://redmine.knownelement.com:3000 (was a heavy trac user, but redmine won me over with the modern UI, thick client and sub tasks/tickets)
Invoicing clients | Bye bye freshbooks / simple text files / spreadsheets generated manually | argentuminvoice | http://invoices.knownelement.com (pretty happy with this software. produces very nice looking invoices)
URL shortener | Bye bye tr.im/tinyurl | casimir | http://url.knownelement.com
Knowledge management | (never really used any other systems other than text based note files) | mediawiki | http://wiki.knownelement.com
Centralized login | claimid/facebook connect | Active Directory/OpenLDAP/Kerberos/FreeRADIUS and OpenID (phpMyID) |
Collaboration | Google docs/skype/webex/instant messaging | Openfire jabber server with Kraken for aim/msn/yahoo/irc interop and Karaka for skype interop, asterisk/freeswitch/pbxinaflash for secure voice chat, openmeetings/bigbluebutton for webex replacement and eyeos for collaborative document editing | http://desktop.knownelement.com hosts my eyeos instance; jabber is only available via the vpn. Voice stuff and Bigbluebutton/openmeetings not yet deployed. That's really its own project.
Data Sync (between mobile devices and "the cloud") | Google/yahoo sync software, active sync, bes | funambol + zimbra and exchange |
CRM | salesforce.com | missing application |
E-mail | Gmail | Missing application (zimbra is nice but doesn't have all the features gmail does) | N/A
Note taking | evernote | Using Zimbra for this |
Fleet tracking | latitude | missing application (evaluating http://opengts.org/) |
Calendar/tasks | Thunderbird+Sunbird/Outlook | missing application |

Step 3: Assemble the infrastructure for hosting

Physical infrastructure

Servers

VM servers:

proxmox-prod-2:

prod virtual machines: www, mail, dns, opsview, ziptie
on bare metal: torrentflux, mediatomb, firefly, dhcp

proxmox-prod-3:

development/test/desktop virtual machines
on bare metal: dvr for surveillance, bluetooth for voip

Other servers:

livingroom:

active directory,exchange,terminal services,radius,network access protection


Network Gear
  • 1 Motorola DSL modem
  • knel-prod-router:

pfsense appliance that is the core router/firewall for the entire network infrastructure. It also provides numerous network-related services (snort, av, proxy, traffic shaping, modsecurity).

  • knel-prod-cs1:

3750 PoE switch; the core switch for the entire house network. Provides power to the ubnt nanostation2 AP on the roof and to the mesh potato access point in the office and garage.

Details of configuration can be found at Network_Stuff#Production_Network wiki page.

Other Gear
  • APC UPS (cisco gear plugged into surge only, usb drive and dell optiplex plugged into battery backup)
  • 1TB USB drive
  • printer:
  • ups: supports proxmox-prod-2, pattis hard drive, main data drive

Server Software

E-mail bits

Sending e-mail from a "dial up" IP range can be a royal pain. Some things to help:


Found via http://www.dslreports.com/faq/14282



For DSL DNS Needs, call:
DSL Provisioning 800-833-2120 Options 1, 2, 1

For Dedicated Access DNS Needs, call:
Dedicated Enhanced Service Center (DESC) 1-866-937-3664, Options 3,5

For SBCIS Sales, call:
1-888-724-7253

For Web Hosting services, call:
Web Hosting Sales: 888-WEB-HOST (1-888-932-4678) 
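
The DNS contacts above suggest that part of the pain is reverse DNS and blocklists for residential address ranges; that reading is an assumption, since the page does not say so outright. Under that assumption, here is a minimal sketch of how the two DNS names a receiving mail server typically looks up are built from a sending IP. The helper names and the blocklist zone `dnsbl.example.org` are placeholders, not real services, and the address used is from the documentation range.

```python
import ipaddress


def ptr_name(ip: str) -> str:
    """Return the reverse-DNS (PTR) query name for an IP address."""
    return ipaddress.ip_address(ip).reverse_pointer


def dnsbl_name(ip: str, zone: str) -> str:
    """Return the name queried in a DNS blocklist zone for an IPv4 address.

    The octets are reversed and the zone is appended; an answer for this
    name means the address is listed in that zone.
    """
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    octets = str(addr).split(".")
    return ".".join(reversed(octets)) + "." + zone


if __name__ == "__main__":
    ip = "192.0.2.25"  # documentation address, stands in for the DSL IP
    print(ptr_name(ip))
    print(dnsbl_name(ip, "dnsbl.example.org"))  # hypothetical zone
```

Feeding either name to a resolver (for example `dig`) shows what a remote mail server would see for the sending address.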

Jabber Bits
Web Software Bits
LDAP Software Bits

Took me a bit of searching. Came across

Also some tools:


Client Software

E-mail
SIP
Jabber

Step 4: Migrate Data

Step 5: Host data in a sustainable fashion

Backups:

  • s3
    • cost effective
    • off site
    • easy to implement
    • you wanted to actually restore those backups? :)
  • local storage
    • cost effective
    • want off site and rotation? just buy a few drives
    • easy to implement
    • restores very nicely
  • replication
    • mysql (instructions here)
    • dns (instructions here)
    • apache
    • linux ha
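
The "just buy a few drives" rotation for local storage can be made mechanical. A minimal sketch, assuming a weekly swap and three drives; the drive labels and the weekly schedule are illustrative choices, not something this page specifies.

```python
from datetime import date

# Hypothetical drive labels; one of them is always off site.
DRIVES = ["drive-a", "drive-b", "drive-c"]


def drive_for(day: date, drives=DRIVES) -> str:
    """Pick the backup drive for the week containing `day`.

    Weeks run Monday to Sunday and are counted continuously (no reset at
    year boundaries), so the drives are used in a strict cycle.
    """
    week_index = (day.toordinal() - 1) // 7
    return drives[week_index % len(drives)]


if __name__ == "__main__":
    print("Back up to:", drive_for(date.today()))
```

Whichever drive is not named for the current week can sit off site until its turn comes around.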

Security:

Software/logical
  • snort/securita
  • greensql
  • logwatch
  • openvas
  • awstats
Physical
  • guns
  • alarm systems
  • dogs
  • bolting the gear to the rack :)


Monitoring:

Internal
  • OpsView
  • Netdisco
  • Rancid
  • Run apt-get install logwatch snort and point root's mail at a live e-mail address. This provides an excellent daily summary of activity on the system.
External
  • Nothing at the moment.
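
For the logwatch item under Internal, "set root to a live e-mail address" usually means editing the mail alias for root. A minimal sketch, assuming the MTA reads a sendmail-style aliases file (commonly /etc/aliases); the function name and the example address are hypothetical, and the function only transforms text so it can be tried without touching the system.

```python
def set_root_alias(aliases_text: str, address: str) -> str:
    """Return aliases_text with the root alias pointing at address.

    An existing "root:" entry is replaced; if there is none, one is
    appended. Other lines, including comments, are left untouched.
    """
    out = []
    replaced = False
    for line in aliases_text.splitlines():
        if ":" in line and line.split(":", 1)[0].strip() == "root":
            out.append(f"root: {address}")
            replaced = True
        else:
            out.append(line)
    if not replaced:
        out.append(f"root: {address}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = "postmaster: root\nroot: nobody\n"
    print(set_root_alias(sample, "me@example.org"), end="")
```

After writing the result back to the real file, sendmail and postfix setups need `newaliases` run so the change takes effect.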

Get all the details at my Monitoring_Alerting_and_Network_visualisation wiki page.