smolt privacy policy

Wed Feb 21 03:55:31 UTC 2007

Christopher Blizzard wrote:
> On Mon, 2007-02-19 at 12:08 -0600, Mike McGrath wrote:
>   
> That aside, I think what you have is a good start.  I would start by
> listing out what we're collecting, how we connect that to people (or
> not) and how we're going to use it.  And start out with why we're doing
> it so that people understand our motivation.  Or, another way to put it,
> what is the acceptable use policy for the information and how it affects
> others.
>   
Yeah, it'd be good to explicitly state what we're going to do with the 
data and then actually follow it as a benchmark for how useful smolt is 
to us.  If we say we're going to do a bunch of things and this time next 
year we haven't, that would be bad :-)
> Google's privacy policy is pretty good for its format.  (I won't comment
> about the content.)
>
> http://www.google.com/privacypolicy.html
>
> The EFF has some decent resources:
>
> http://www.eff.org/Privacy/
>   
I'll grab some ideas.
> But that aside, I think that we need to lay down some ground rules for
> what we want to have as outcomes.  Here are my personal views on what we
> should try to explain in the policy:
>
> 1. That we collect information about the hardware you have in your
> machine as well as things that are connected to your machine.
>   
+ packages in the near future.
> 2. That information is linked with a unique identifier, if the user
> chooses to provide one.  This identifier is only there to determine if a
> driver breaks or gets better over time.  (It's not just about leverage,
> it's also about quality metrics we can add later.)
>
> 3. That unique identifier is never connected to an IP address.
>
>   
That's not quite true...  It is linked in the web logs, and that is by
design (abuse prevention / correction).  Those logs are kept for a
finite amount of time, which is listed in the policy.
> 4. Information about hardware is only released to the public in
> aggregate.  That is, we will never release information about specific
> users, only about trends and groups of users.
>
> 5. That anyone who has access to the raw data that makes up the
> aggregate will be required to enforce this policy and will not release
> specific information to the public.
>   
I've debated this off and on.  Honestly, I think it would be great to
make the databases available to the public.  I can't imagine what harm it
would do, and people could run their own queries against the database as
they want to.  For some reason though, in the back of my mind this seems
like a bad idea.  I can't give specific reasons why.

    -Mike