Category: internet

Vendor Hosting, SaaS, Cloud Computing: What’s the difference?

When talking with non-IT people, it’s amazing (ok, not really) how much confusion there is about what these terms mean. While the terms are related, they are far from interchangeable. People sometimes like to throw around “buzzwords” in order to sound knowledgeable or on the cutting edge. If you deal with this stuff, it’s good to understand the differences.

Vendor Hosted

Vendor hosting is a relationship. Vendor == someone you’re paying to do something for you. Hosting == the something you’re paying them to do. You don’t own the hardware, they do.

This can be wonderful if you don’t want the expense and complexity of deploying and maintaining the hardware, and if the vendor is good at doing those things.

But it can suck if your vendor doesn’t provide good service, or shuts down suddenly, leaving you with no access to your data.

Software as a Service

SaaS is a business model. With software as a service, you pay the way you pay for a utility (based on your usage) or by subscription (based on time). Like a utility, if you stop paying, you can’t use it anymore.

Contrast SaaS with Free software and licensed software models.

Free software is, well, free. Often the software itself costs nothing, but you pay for support or customization, or by devoting resources to maintaining the codebase.

Licenses are typically perpetual, but limited to a major release version and its patches. When the next major release comes out, you have to pay for the upgrade. Usually the vendor compels their customers to upgrade by dropping support for the old version, offering discounts on upgrades for existing licensees, and so on.

Unlike the licensed software model, with SaaS you’re usually entitled to the latest version rather than paying for upgrades. SaaS offerings are often, but not always, vendor hosted.

This can be wonderful if:

  • You hate managing licenses.
  • The upgrade treadmill gets you down.
  • You’re comfortable with predictable costs, and realize that paying the vendor once a year for the next version costs about the same as paying for SaaS over the course of a year.
  • You don’t use an expensive solution very much, but would pay a small amount for just what you need. You’re not going to spend $25,000 on CAD software, but if you could pay $25 to use it for a few hours, you’d be able to design that thing you want to build for your hobby.
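To see why the yearly math can come out the same, here’s a toy cost model (every number here is invented for illustration, not pulled from any real vendor’s price list):

```python
# Toy comparison of perpetual-license vs. SaaS costs (all prices invented).
def license_cost(years, upgrade_price=1200):
    """Buy the new major version once a year."""
    return upgrade_price * years

def saas_cost(years, monthly_fee=100):
    """Pay a monthly subscription instead."""
    return monthly_fee * 12 * years

# At $1200/year for upgrades vs. $100/month for SaaS, the totals match:
for years in (1, 3, 5):
    print(years, license_cost(years), saas_cost(years))  # e.g. 5 6000 6000
```

The point isn’t the specific numbers; it’s that once the annual totals are comparable, the SaaS model trades ownership for always having the current version.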

But it can suck if you want to pay for something once and own it; if the SaaS vendor goes out of business or decides to stop providing the service; if you prefer to stay on a version that meets your needs well enough that you have no compelling need to upgrade; or if the vendor’s upgrades take the product in a direction that doesn’t suit your needs.

Cloud computing

Cloud computing is an architecture: a distributed cluster of redundant, co-located, load-balanced hardware, typically spanning multiple datacenters, used to run virtual machines which are not tied to any specific hardware. This amorphous nature, decoupling virtual machines from any specific node in the cluster, is what gives rise to the term “cloud” computing.

“The cloud” is often used as a generic term for “the internet” because network diagrams symbolize the internet that way: its specific architecture is unknown and not really of concern to you. But just because something is on the internet does not mean it is “cloud computing”.

The cloud computing cluster’s capacity is typically shared among many customers. Due to the nature of the cluster, it is possible to scale a customer’s utilization of the cloud very rapidly, growing a virtual machine from a single node on the cluster to several or many. Conversely, the cloud may idle portions of the cluster when demand is low, resulting in great efficiency and flexibility. If a node in the cluster goes offline or experiences hardware failure, the cloud continues chugging along and only the admins will even notice that anything happened.
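That decoupling is the whole trick. A toy sketch of the idea (purely hypothetical; real cloud schedulers are vastly more sophisticated):

```python
# Toy model of a cloud cluster: VMs aren't tied to hardware, so when a node
# fails, its VMs are simply rescheduled onto the surviving nodes.
class Cluster:
    def __init__(self, nodes):
        self.placement = {node: [] for node in nodes}  # node -> VMs on it

    def schedule(self, vm):
        # Place the VM on the least-loaded node currently in the cluster.
        node = min(self.placement, key=lambda n: len(self.placement[n]))
        self.placement[node].append(vm)

    def fail_node(self, node):
        # Node goes offline; its VMs migrate. Customers never notice.
        for vm in self.placement.pop(node):
            self.schedule(vm)

cluster = Cluster(["node-a", "node-b", "node-c"])
for i in range(6):
    cluster.schedule(f"vm-{i}")

cluster.fail_node("node-a")  # hardware failure
# All six VMs are still running, now spread across node-b and node-c.
print(cluster.placement)
```

The same mechanism working in reverse is what makes rapid scaling possible: adding a node to the dict immediately makes it a candidate for placement.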

Cloud computing is also typically vendor hosted, since it requires specialized professionals to set up and maintain the cluster. Very few entities need to own their own cloud; most will buy cloud computing in the SaaS model from a vendor who hosts it. You can pay for a tiny portion of the cloud, and as you grow you can expand your footprint, paying more for it but generally avoiding many of the usual headaches associated with scaling up a rapidly growing enterprise.

This can be wonderful if you can live with not owning your hardware and are comfortable with the idea that your data and processing could be “anywhere” in the cloud, co-mingling with other virtualized instances running on the same node(s) in the cluster.

But it can suck if your vendor decides they don’t need you or gets a request for your data from the government and just rolls over and complies with the request, rather than fighting it like *your* legal team would.

Bottom line

Any time you put your data or services in the hands of someone else, you really are putting a tremendous amount of trust in them. You need to consider not just the benefits of the relationship, but the risks as well.

Outsourcing is attractive because enterprise IT is expensive and difficult to do well, and if it’s not the core competency of your business, it may make sense to have someone else do it who has that expertise. But you need to make sure that your vendor actually does have that expertise!

The pre-sales people will promise you the moon, but you need to be able to verify that they can back those promises up. One of the great selling points is that you don’t have to worry about all the complexity of maintaining IT infrastructure anymore. But it’s not like that responsibility disappears into nothing — it’s the vendor’s responsibility, and you now need to worry if they’re doing it right. If you’re paying someone else to do it right because you don’t know how to do it right, and you’re trusting them to do it better, or cheaper, than you could, how do you do so with confidence that they’re really that good?

Usually, outsourcing puts up organizational boundaries which can be barriers to transparency — you can audit internal processes much more readily than external ones. But are you any better at auditing IT than you are at running IT? Probably not. There may be third party auditors you can outsource to… but how do you establish that you can trust them?

So how do you establish trust? This is something you need to figure out before you move into a vendor hosted solution. One approach is to move gradually. Do a pilot project and see how it works over a period of time before jumping in with both feet. Plan for moving off of the vendor’s services so you can do so with short notice if need be. Utilize redundant vendors if that’s feasible, so all of your eggs aren’t in one basket.

Work at establishing trust continuously — trust is not a “set it and forget it” kind of thing. Trust comes from a relationship based on iterative transactions. It is built with each interaction that you have. Despite sales pitches, outsourcing does not mean that you don’t have to worry about things anymore. It means you need to establish a management relationship with people outside of your organization who are assuming responsibilities and risks which still affect you. It pays to maintain a close relationship with them.

If you’re not engaged with your vendors and don’t have good rapport with them, then chances are very good they’re going to ignore you, take you for granted, and fail to understand what your needs are, and when it comes time to deal with problems, you won’t be familiar with each other. Communicating effectively and working together will be a second-order problem that prevents you from fixing the original problem.

Don’t be lulled into a false sense of security, and don’t make the mistake that because something is the vendor’s responsibility, that you don’t have to worry about it. It’s still your enterprise that’s at risk. When everything’s running smoothly, it’s great, but problems may actually become more complicated, due to the vendor relationship and the fact that you don’t have direct access and full control over the systems they are managing for you.

Risk management is incomplete without an answer to the question, if the vendor goes away tomorrow, what do you do? If maintaining control over your data is critical, it’s a good idea to require that you have a local copy stored on hardware that you own. At the very least, the vendor should make it easy for you to export your data, ideally to a documented format which will be useful for you in the event you need to switch vendors.
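The export requirement is simple to illustrate. A minimal sketch (the records here are invented; in practice they’d come from whatever export facility the vendor provides):

```python
# A sketch of keeping a local copy in a documented format: dump records to
# plain JSON on hardware you own, so a vendor disappearing doesn't take
# your data with it.
import json

records = [
    {"id": 1, "customer": "Acme", "notes": "renewal due in March"},
    {"id": 2, "customer": "Globex", "notes": "migrated from v2"},
]

with open("local_backup.json", "w") as f:
    json.dump(records, f, indent=2)

# Because the format is documented (plain JSON), your own tooling, or a
# future vendor, can read it back without the original vendor's help:
with open("local_backup.json") as f:
    restored = json.load(f)
assert restored == records
```

A documented format matters as much as the copy itself: a backup you can’t parse after the vendor is gone is barely a backup at all.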

Redefining “success” for the Kickstarter bubble crowd, and why you shouldn’t.

So, this article has gone around and gotten attention. It’s an interesting topic, understanding the factors that contribute to a project raising its startup funding from “the crowd” successfully, but I want to take a moment to divert on to a tangent for a bit, and take issue with their definition of “Kickstarter success”.

This is important, because if Kickstarter is to succeed at changing the world, we need to make sure we don’t mistake “funding success” for “project success”.

Seriously, this is really, really important.

Funding success is, like, maybe the third or fourth step in a project — far from the final one. Project success is what really counts. You have to do the work. You have to deliver your product. Only then can we decide whether the project was a success.

Yeah, it’s really cool that people liked your idea enough and in such numbers that you got to raise enough money to hit your goal and actually collect that money. Don’t you dare think of the Kickstarter as “successful” at this point! The project is only beginning. When you deliver the product that you promised, then you can make a claim to success.

But finishing isn’t even success. Not really. If you completed the project, but went way over budget, or delivered so late that no one cared and everyone now hates you, your Kickstarter won’t be remembered as successful. If the end results are of poor quality, no one will call that successful. If you don’t set yourself up for your next successful project by building on the success of the last successful project, whatever success you do attain will be quickly forgotten.

It’s only natural for people to celebrate reaching an important milestone, but don’t confuse your funding milestone with the finish line. Stop calling funded Kickstarter projects “successful” until they are.

If you don’t? Well, you’ll be deluded. And the project owners will be deluded people with a big pile of money. And big, probably fragile egos.

You’ll feel like you had the meal when you merely looked at the menu. Getting your money up front, I’m sure, feels wonderful. But don’t let it go to your head. You need to show us results. I worry the exuberance everyone feels when a project gets funded will make people forget about delivering the results and making a successful product. The focus will be on the run up and the party that happens when the funding target is reached. There’s a long, not very sexy period of working your ass off that comes after this point, and if you allow yourself to get too high on the “success” of having all that money you said you needed to attain your dream, you might just forget about the dream.

And then we’ll have scandals and repercussions. And the good will of the crowd will dry up. You don’t want to ruin that trust, because once it is ruined, regaining it will not be easy. Please don’t diarrhea into that swimming pool full of money.

http makes us all journalists.

Something I did made the paper a year ago, and I just now found out about it:

The blog article mis-attributes the quote “http makes us all journalists” to Peter Fein, who I’ve never met or had any dealings with, but I actually came up with the slogan, and the 1.0 design of the press pass. [Update: It turns out that I have met Pete, but it was long after the fact of the BART protests. In fact, without realizing that he was the same person from the article above, I met him and his wife Elizabeth at Notacon9. I’m happy to report that they’re wonderful people.]

After I created the design and it was put up on the Noisebridge wiki, I remember a number of journalists took offense to the slogan at the time, completely missing the point of it while being defensive about their college major or profession. I guess it stings when your career is threatened by the emergence of a new medium that the old guard doesn’t readily understand and misses the boat on, while you watch newspaper after newspaper go out of business or get swallowed up by corporate media conglomerates.

I get that Journalism is a serious discipline and has standards, most of which are completely gone from broadcast and publishing these days, but whatever, they’re important standards. Saying “http makes us all journalists” wasn’t meant to insult your diploma, your profession, your Peabody, or your Pulitzer.

The point was that the slogan appears directly after a quotation of the full text of the first amendment, which guarantees freedom of the press. The internet, especially http, enables all of us to be our own press. Freedom of the Press isn’t just freedom for Journalists, but for artists and authors and everyone who has a mind to express thoughts with. “With HTTP, All Can Publish” might have been a more accurate slogan, but I came up with the idea in about 10 minutes, and I like the spirit of it, so I’ll stick with it. Frankly, I’d rather there be more agitated journalists in the world than the corporate media shills who have largely supplanted them while abdicating the Fourth Estate for a comfy paycheck. If you’re a journalist and the slogan pissed you off, good. If it inspired even one person in the general public to take up the mantle and aspire to become a serious journalist, even better.

I created the design when a friend in SF, tweeting about the Bay Area Rapid Transit protests happening at the time, said that people who didn’t have press passes were being denied access to the protest area. The protests were in response to police shooting and killing a homeless man who was prone on the ground at the time he was shot, and not a threat to anyone. That shouldn’t have happened. I felt strongly that the protesters had a right to protest and a right to cover their own actions and publish about it, so I created the press pass. It took maybe a half hour, a couple of rectangles in Illustrator, and I was done. The idea that the right to be present to cover an event should be limited to those who possess a Press pass struck me as an unconstitutional abridgment of rights reserved for all. So I created a Press pass for all.

The version I created didn’t have the photo of the person wearing the Guy Fawkes mask in the ID photo. My idea was that you’d take your own passport photo and put it in there — I measured everything out carefully to be sized correctly, and made the card the size of an ISO spec for ID cards that I found details of on the internet.

The image at the right is the symbol for Noisebridge, a SF hackerspace that I’d like to visit someday. I ganked the image from their website and incorporated it into the design, since the friend who got the idea started was affiliated with them, and they were involved in some capacity in organizing the BART protests. I’ve met some cool people from Noisebridge who I consider to be good people: bright, conscientious, inquisitive, concerned.

The reverse of the press pass had the text of the First Amendment and the slogan “http makes us all journalists”, which was meant to emphasize that the internet is a truly democratizing force, enabling each and every one of us to communicate with everyone else, reaching people we might never otherwise have known about, and impossible to censor… though they never do quit trying.

Someone else put the Guy Fawkes image in there, but you could just as well replace it with your own image if you wanted, as I originally intended. The “points system” for doing this or that with the pass to make it more authentic looking was also someone else’s idea, as was the information resources to help people know their rights. Each contributor acted freely of their own accord to contribute their ideas and built off of them without ever talking to each other. It is what you make it. Modify to suit your needs. Do what you want, be responsible for what you do. That’s the power true freedom gives you.

I’m not a member of Anonymous, as I’m not posting this anonymously. Anonymous does some good, some bad, just like anything else. I don’t know anything more about them than what you can read on the internet.


More favorable coverage, with an image of the original design:

The Great Google Privacy Policy Consolidation

A friend of mine asked recently:

Hey Chris –

I have a question and figured you might be a good person to ask – this is regarding the Google privacy policy.

I do not have a gmail or google + or youtube account. Do I need to do anything for privacy protection, then? I do use google as a search engine for documents and images. I also use, but just as an anonymous user without an account. Should I try to erase my browsing history? I do that anyway with my isp, but since I don’t have an official google account, do I need to worry about any of this stuff?

Thanks, Chris!

Ironically, this was on Facebook, but it’s still good to at least be concerned about privacy, right? I figured the reply I gave them was blog-worthy, so I treated it as my first draft, re-worked it a bit, added some more thoughts, and embellished.

Here’s what I said:

Ultra-short answer:

We’re screwed no matter what we do, so don’t worry about it too much.

OK, maaaaybe “screwed no matter what” is overstating it a bit, but I don’t think so. We really have very little recourse or power over how information about us is used. I suppose I could rephrase it, “We’re at their mercy no matter what.” and be slightly more accurate, but I suspect it’s just semantics at that point.

Why do I say this?

What meaning is there in a privacy policy? A privacy policy is basically a token offering of transparency, intended to show that the web site is acting in good faith to try to make it known what they will do or not do with information that you give to them.

How do you know if they act according to policy? Generally, you don’t. It’s possible you might catch them slipping up if they do something really dumb. What then? They issue a [lame] apology, the news media forgets the whole thing in a day or two.

What recourse do you have if they violate their own policy? I dunno, maybe sue them?

They can change the policy at any time to whatever they want it to be, but they already have whatever information you’ve given them, and it’s fairly reasonable to assume that they always will have it. It’s not good enough to have an acceptable policy now, if they can change it to an unacceptable policy later.

Mind you, that information you provide to them is not just the explicit, deliberate information you give purposefully, such as your user profile information. It’s also information you unconsciously provide, that they can gather from your actions on the site, such as you have a tendency to click on links that look like they might take you to pictures of boobs, or whatever. We betray ourselves constantly by doing and being ourselves and being observable.

A privacy policy is only as good as the integrity of the issuer. Policies change over time, usually without as much notice or forewarning as Google has given. When they do change, I’m always reminded of the scene in Empire Strikes Back when Darth Vader tells Lando Calrissian that he’s changing the deal.

Darth Vader: Calrissian. Take the princess and the Wookiee to my ship.
Lando: You said they’d be left at the city under my supervision!
Darth Vader: I am altering the deal. Pray I don’t alter it any further.

A privacy policy isn’t a contract. It therefore isn’t binding.

Even if a policy were binding, that policy can become null and void if the company gets acquired by another company, particularly if they go bankrupt, or if the company is forced by legal proceedings to divulge information. When a company gets split up and its assets become the property of its creditors, those assets include information about you, the user. The creditor isn’t bound by the policy, and is beholden to its investors to maximize the value of the assets it recovered from the bankruptcy. Chances are, that means your information is going to get used in ways you probably wouldn’t like if you knew about it or could do something to prevent it. Your only real hope is that the creditor cares about public opinion about it. Which, it might realize it does, but only after the fact, when it is too late to prevent the harm that violating your trust has caused.

Privacy policies also do nothing to protect you against external abuse of the service, ie “hacking”. If the service experiences a data breach, your data is being used in ways you don’t want, but the policy does nothing to prevent this or protect you. You might be able to sue, if you have the time and a good lawyer, and, if they were hacked due to willful negligence, you might even prevail in finding them liable for damages, although most likely, the Terms of Service you agreed to almost certainly indemnified them. But even if you win, and are awarded damages, that still doesn’t redact the information that’s now out there.

All of this background is pretty far afield from the specific question about Google’s privacy policy consolidation. But I think it’s the most germane thing to say about the matter, because, ultimately, privacy policies are pretty useless, meaningless things.

I’m not suggesting that Google doesn’t follow their privacy policy, or that their policy is bad, I’m just saying that policies are like promises that corporations make at their convenience, and change as suits them. So, not really promises.

Now, keeping that in mind… let’s talk about Google.

Short answer:

  • If you do not have any google accounts, you are relatively safe, and the policy changes don’t really change anything for you.
  • If you do have accounts with google, and are not logged in, you are relatively safe, as long as you always remember to log out whenever you don’t want your usage of google to be tied to an identity that you use.
  • What you do when you’re not logged in, won’t be explicitly connected to your google identity.
  • However, that’s not to say that your activity can’t be traced to your identity with a little effort. Your activity will assuredly be logged, and, combined with other information your computer or browser reveals about you (your IP address, geo-location, cookie information, your browser “fingerprint”, usage patterns, your online friends and contacts, the way you misspell words, your writing style), could potentially be used to identify you even if you’re not giving away your identity explicitly by being logged in.
  • Google (as with any web site) can still track what visitors do when they are not logged in, but these behaviors are not explicitly tied to an identity. It’s not difficult to infer an identity of an anonymous web visitor using various techniques, given enough collected information to establish behavior patterns.
  • In fact, most web sites (including this one) use a Google product called Analytics to help them accumulate stats about the use of the site. This sort of information is pretty harmless: it just gives visitor counts, search terms that led someone to your site, what time of day people visit, how long they stay, where in the world they are visiting from, and that sort of thing. I wouldn’t call myself an expert, but I don’t see much potential harm in this sort of information being collected. Still, there are concerns, since other web sites using Analytics effectively multiply Google’s reach.
  • If you use the Google Chrome web browser, or an Android phone, they absolutely do track usage, anonymously or not, and even if they don’t care who you are specifically, they’re getting a pretty good picture of it anyway. Google most likely will not do anything with it beyond helping advertisers find you so they can sell you things you’re more likely to want to buy. That’s not to say that they couldn’t decide to use the information in other ways if they wanted to, though. Some people in the know have said that the entire point of Chrome and Android is to gather information about their users for Google’s gain.
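The fingerprinting idea mentioned above is easy to illustrate: many individually weak signals, hashed together, make a nearly unique identifier. A toy sketch (the signal values are invented, and real fingerprinting uses many more, like installed fonts and canvas rendering quirks):

```python
# Toy browser fingerprint: combine weak signals into one stable identifier.
import hashlib

def fingerprint(signals):
    # Concatenate the signals in a stable order and hash them.
    blob = "|".join(f"{k}={v}" for k, v in sorted(signals.items()))
    return hashlib.sha256(blob.encode()).hexdigest()[:16]

visitor = {
    "user_agent": "Mozilla/5.0 (X11; Linux x86_64) ...",
    "screen": "1920x1080x24",
    "timezone": "America/Chicago",
    "language": "en-US",
}

fp = fingerprint(visitor)
# The same browser yields the same ID on every visit, logged in or not:
assert fp == fingerprint(visitor)

# Change any one signal and the identifier changes completely:
visitor["timezone"] = "Europe/Berlin"
assert fingerprint(visitor) != fp
```

No single signal identifies you, but there are few enough browsers sharing your exact combination that the hash effectively does.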

One of the main things people are concerned about is that their google search queries and youtube viewing history and favorites, which they had long thought were private, would be linked to their identity, and that this link would be made public through Google’s new social features.

Google has always made search trend data (aggregated statistics about supposedly-private search terms) public. That’s how we knew during the 90’s that everyone was searching for Britney Spears, remember?

What’s new is their integration of search with their new “Google+” identity service. Social search is supposed to help you find stuff that’s more relevant to you by telling you what your friends +1’d. This is great until you discover that one of your friends has some disturbing interests, and that gets you to wondering what interests you have that others might find disturbing. Anything you publicly +1 is visible to the internet at large as something you “liked”. There is a natural inclination to interpret a +1 or Like as endorsement, regardless of whether you actually agreed with it, or laughed at it, or hated it, or just thought it was interesting. It’s disturbing to most people to think that others viewing might jump to conclusions about who you are, based on the things you +1.

If you don’t like this, there are other search engines you can use, such as duckduckgo, which promise not to track you at all. Again, this is nothing more than a promise, and you really don’t know whether they do or not.

Google isn’t the only one who does this, of course. Facebook has infected virtually the entire internet, allowing you to “log in with facebook”, or “Like” anything and everything. This information is shared with your friends, with Facebook and Facebook’s partners, and with the site whose content you Liked or logged in to view. People “liking” stuff and sharing links with each other is how word spreads around and content “goes viral”. This is great if it makes you famous or puts public pressure on someone doing something we don’t like. But when it’s you doing something perfectly within your rights, and the public doesn’t like it, you can feel oppressed or threatened. Worse things than that can happen, too. You can lose your job, get arrested, lose friends. Your whole life can be ruined.

And for all that, it may be that this new social aspect of web searching is more useful than it is harmful, that on the balance it is a net good, albeit with risks and drawbacks. One benefit of public social search is that it makes it easier for you to find content that is relevant to you, and to share that content with your friends. Content your friends like is very likely to be of interest to you, so weighting a search result that has been “+1’d by someone you know” makes a great deal of sense. And, as long as the friend +1’d it knowing that their +1 would be used as a recommendation this way, it’s all well and good.

Webmasters are always clamoring for better rankings in Google’s search engine so they can get more traffic as a result. As unscrupulous sites learn to game the system, exploiting principles of SEO to attract traffic “undeservedly” without providing what that traffic is really looking for, thereby wasting everybody’s time in order to reap ad revenue, Google has continually worked to refine PageRank to keep its results relevant and keep spam down. Social bookmarking is merely the next iteration in that arms race. The countermeasure, of course, is also already here: advertising campaigns which bribe you into liking or +1-ing pages in order to get points, a discount, a chance at a prize. And so it goes.

Another potential problem is that your favorite service may end up being acquired by one of the behemoths. Yahoo! loves to do this and usually screws their users in various ways. Google does too, but is usually better about preserving the quality and value of the user experience. All the big players play this acquisition game to some extent. So, if you think you’re safer using a smaller web site that promises they’ll never sell you out to third parties, remember the promise is only as good as their word, and only good as long as they exist as themselves, and tomorrow they could change their mind, get acquired, or get served a subpoena. It could happen to DuckDuckGo just as well as it could happen to anyone.

Why the consolidation? What’s the problem?

I think that consolidating privacy policies and making them more consistent across the services Google offers is generally a good idea. Over the years Google has amassed a considerable number of online services; maintaining dozens of separate policies, and keeping information about how you use each service separate, doesn’t make a great deal of sense.

I think it’s to Google’s credit that they’ve been forthcoming about the changes and actively promoted what they are doing, to keep things as transparent as possible. Google does listen to user feedback and tries to do the right thing, although of course not everyone agrees that they always do.

Nevertheless, what is understandably disturbing is the concentration of the information those services collect about you, and what can happen when information from an account you created to shield your identity via pseudonym catches up with you and is linked with your “true” identity.

If you have a persona on one service that is very different from your “normal” self, it can be embarrassing or damaging for people who know you in one world to suddenly find out that you also live in another world as well. There are legitimate needs people have to compartmentalize their lives in this way, and it shouldn’t be google’s place to judge or to decide for them.

I really don’t think that they do judge, but they do seem to be deciding a bit, by linking services this way. If you thought your accounts were separate, that’s probably a misconception that you bear responsibility for; you could have created separate accounts. It’d be a pain to log out of one and into another each time you wanted to visit a site, but at least you’d have your e-life compartmentalized.

The concern with this consolidation is that there’s now potential for inadvertent slips of information: your email usage data is tied to your youtube usage data, and potentially becomes visible to everyone with a Plus account whom you’ve ever added to a circle, or even to the public at large. Now the company you’ve emailed about a job knows you enjoy watching videos of cats doing cute things, or that you’re an ardent environmentalist, or a gun nut, or think recreational drugs should be legalized, or that you oppose war. Oops. People are really more worried about being judged by others, not just by Google.

What to do?


Be anonymous as much as you can. That means don’t log in. When you do need to log in, use https and other encrypted protocols as much as possible (sftp, ssh, etc.) Https is a good idea even for general browsing when you’re not logged in. Use Tor. Encrypt your email.

Unfortunately, so much of the web now depends on you being logged in, or identifying yourself somehow. To access content, to share it with your friends, to comment, to purchase. Sooner or later, you’re going to need to log in.


A simple solution to this is to use pseudonyms. Use one identity for official business, and another for your nasty business. Don’t mix the two up, and don’t let your porn-loving pals know what your real name is. Have as many pseudonyms as you think you need to keep your various identities separate, each segregated to whatever communities you choose to use that identity for.

Is it possible to somehow establish a link between your pseudonymous account and your other accounts, or your real identity? Sure. But that’s more something a private detective or law enforcement official might try to do, not something Google’s terribly interested in doing. Although, if Google wanted to, it’d be trivial for them to do.

Is it possible to screw up and accidentally send that official email from the iloveporn account? You better believe it. Be careful.

A pseudonym is something you’d use for relative anonymity, but where you still need an identity that persists over long term, so that other users of a community can have some sense of “knowing” who you are.

Throw-away accounts

If you’re more worried about your activities being traced or tied to you in any way at all, it makes sense to create and dump accounts for specific, short-term purposes. Throw-away accounts can help a little by compartmentalizing information about you and keeping the amount of information gathered on any single account to a minimum. Each time you start over fresh with a new account, it’s as though you’ve thrown away your past information, so long as it cannot be tied to your real identity(-ies), or your other throw-away accounts.

If you ever use an account to do something you don’t want traced back to you, use a throw-away account, use it for one thing and one thing only, discontinue using the account as soon as possible, and delete the account if possible once you’re done with it — not that this will delete the data they’ve collected, but it will prevent you from re-using the account again and adding to the data trail, thereby limiting what they can acquire about you with that one account.

If you’re ultra-paranoid, use the account from a public wifi access point, using a clean-installed OS and browser with no special customizations. What are you doing, anyway, issuing death threats?

Yeah, I went there. The assumption generally will be that you’re up to no good if you’re going to that extreme. Not, for example, that you live in Syria or North Korea, and this is what you have to do if you want to live.

Privacy enemies love to brand people who take unusual measures to protect their privacy as deviants who have something to hide, likely pedophiles or terrorists. They don’t think about the French Resistance during World War II, or 1984. Unfortunately, this means that if you are one of the few people who does use a lot of privacy protecting countermeasures, you’re making yourself visible in a way that could arouse suspicion.

The only hope here is to get everyone to adopt privacy technology, which is a decidedly uphill battle. The average person knows little and cares less about how vulnerable their information is, and has a hard time understanding the threat picture or how to protect themselves. Unless privacy security is built in at the protocol and application level, and is thus on for everyone by default, the vast majority of users aren’t going to use it.

Should I delete my history?

Erasing your browsing history won’t really help all that much. If you erase it, you erase YOUR copy of it, and thereby deny access to people who have access to your PC, whether they have direct physical access or come in through malicious web sites that can exploit a vulnerability to read cookies set by other sites, view your history, access your saved passwords, or who knows what else.

I find local history useful to bring back something I saw recently and want to go back to for some reason, and it helps me feel like the computer is mine when it “knows” me.

Still, if you’re worried about someone snooping on your PC, erasing your history can be a sensible thing to do.

However, on the server side of the web, there will be a log of your access and what actions you performed through the browser while you are connected to that site, and that isn’t something you can delete. Even if the web site offers you the ability to delete your information, it’s entirely likely that all that does is hide the information from you, while keeping it for the use of the service, for data mining, reselling to third parties, and what have you. When it comes to “removing” data, there’s “remove permissions”, there’s “removing a softlink to an inode”, and there’s “rm -f”. Even if a web service did offer “rm -f”-level deletion of your data at your request, deleting is still legitimately hard — if you expect your data to be purged from all backup tapes and whatnot, forget about it. Ain’t happening.
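Those three levels of “removing” can be acted out on a local file. This Python sketch uses stand-ins for chmod, unlinking a symlink, and rm -f, and shows that only the last one actually deletes anything, and even then only your local copy:

```python
import os
import tempfile

# The three "removal" levels from above, acted out locally:
# hiding data (permissions), dropping a link to it, and deleting it.
workdir = tempfile.mkdtemp()
target = os.path.join(workdir, "data.txt")
link = os.path.join(workdir, "alias.txt")
with open(target, "w") as f:
    f.write("secret")
os.symlink(target, link)

os.chmod(target, 0)            # "remove permissions": data still on disk
os.unlink(link)                # "removing a softlink": the inode survives
assert os.path.exists(target)  # ...so the file itself is untouched

os.remove(target)              # "rm -f": now the directory entry is gone
print(os.path.exists(target))  # False, though backups may still have it
```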

What do they want from me?

It’s easy, and understandable, to feel paranoid about all of this. As the saying goes “Just because you’re paranoid, don’t mean they’re not after you.” But the inverse is also relevant: Just because they’re not after you, specifically, doesn’t mean you can relax about your paranoia. “They” are after everyone.

Most of it does not have anything to do with you as an individual. I mean, sure, it’s possible that a person who has enemies could have this information gathered and used against them, but the world generally is not really that interested in any one person. If you’re a fugitive, or would be if people knew more about what you do with yourself, that’s another matter.

The biggest use of this information is to help target you with advertising that you’re more likely to respond to. Targeted advertising can actually help you — for example by informing you of a product you would like but don’t know about, or by steering discounts your way for things they know you like. I really, *really* hate advertising, but I do actually like it when I want to buy something, start searching for it, and a few days later start getting targeted ads for that thing, offering me discount incentives for it.

I suppose there’s the potential for mind control, brainwashing, and pavlovian conditioning. We are, after all, animals. We don’t like to be controlled or manipulated, and we know we are vulnerable to it. And advertisers want us to spend our money on their stuff. But, the deal is, if they know who you are better, then maybe they can sell you things you actually want and need, and maybe they really don’t care about your private business. As long as the ads aren’t annoying and in your face, I don’t mind them so much, but if they diminish my experience of using a service, I feel it’s my right to block them. They appear on my computer, which after all, I own and control.

But there’s legitimate worry, that this information can be used in ways that harm us, as when insurance companies learn more about who you are and decide you’re more costly to insure or are uninsurable, or if the government starts to suspect that you’re an enemy of the state, or a corporation determines you to be a threat of some kind, and won’t hire you.

Where, then?

Even if you are really worried about Google’s privacy change, and all this general internet privacy paranoia talk has got you thinking about ditching the internet, unplugging entirely from the net is only going to help you so much.

There’s so much information gathered about you and shared by those who gather it that they can pull up a pretty good picture of who you are.

If you have “membership” or “discount” cards with businesses, if you use credit cards, if you utilize financial products from lending institutions, if you tend to respond to surveys, if you file taxes, if you’ve lived in the same place for a while, if you haven’t changed your name recently, they have a lot of info on you already. No matter what you do, it’s possible for people to collect information about you if they can “see” you. Once a bit of information exists about you, sharing that information is trivial. It sticks around forever. And it can be combined with other little bits of information about you from all over the place. And an institution with time on its hands and a lot of resources can amass a staggering amount of information about you.

Scary stuff, but good luck fighting against it.

That’s why I say we’re all screwed no matter what, and not to worry about it too much.

Why do I say don’t worry about it too much? Well, if you want to keep your private stuff private — and there is still stuff that we legitimately ought to want to be able to keep private — at the moment it’s a bit of a losing battle. But, the upside of this is that as more and more stuff that we used to keep private becomes exposed, we’re going to find that we had less to fear.

When I said “good luck fighting against it,” a moment ago, I meant “good luck fighting alone to keep your private stuff private.” That doesn’t mean that we’re all completely powerless.

Once you’re outed, you’ll find that there are lots of people like you. And you have strength in numbers. Thinking about people and their secrets, I find it comforting to think about what the gay community has been able to do in the last 50 years to assert their legitimate right to exist and enjoy the same freedoms everyone else gets. They still struggle for acceptance, but just look at all the progress that has been made.

Live the life you want to live, not the life you’re afraid not to live because of what you think others will think of you, not even people in positions of power, who might abuse that power. The best defense against this sort of abuse, in my opinion, is openness. If lots of people stand up at once and assert their rights, they can win them, keep them, and have them. Bad things can, and, I’m sure, will happen to people, and I don’t mean to justify it or minimize it. But at this point, I think we’re better off standing up for ourselves, fighting back, and asserting our rights than we are trying to hide and exercise those rights unnoticed.

Managing Categories and Tags in WordPress

For the longest time, I’ve paid little attention to the categories and tags on this site. I played with the features a bit, but didn’t really understand them well enough to feel like I knew what to make a category, what to make a tag, how to do it consistently, and so on.

As often happens, I figured it out “naturally”, by just using the site and over time the purpose became more clear. Then for a long time I just didn’t feel like going through the tedium of going through all the old posts and re-doing everything. I hated feeling like “If I had to do it all over again, I’d do things differently”, though, so eventually I had to do something about it.

I’m here to share the lessons I learned.

Know your purpose, or if you don’t know your purpose, find it

When I started this site, I wasn’t entirely sure what I wanted to use it for. I knew I wanted it to be a site for promoting and blogging my professional activities, but beyond that I wasn’t sure how I wanted to do it. This was something that developed for me over time, as I became more comfortable. At first I was very risk averse about putting up any content at all. Putting my real name up on the web made me feel inhibited and over-cautious. I didn’t want to make a mistake, embarrass myself, offend someone, lose my job, etc.

As time went on, I began to get over these fears, and it allowed me to post more frequently, feel more free about saying what I want to say, and knowing what I wanted to talk about. I surmise that most web sites develop their purpose over time, and refine what they do. I couldn’t have known how to do everything before I started.

Doing it is an essential part of the process of learning how to do it.

This means making mistakes, and you shouldn’t let yourself be inhibited from making them. Learning from them quickly and doing things better is more important. But sometimes lessons take a while to sink in, and when that happens it is not always the best thing to start making changes right away. You don’t have the time and you quickly lose energy if you put yourself through a comprehensive overhaul several times in quick succession. So before doing a drastic overhaul, take time to think about it, and before you do the whole thing, do a small part of it first and see how it works. Iterate a few times until you think it’s just about right. Then do the overhaul.


Categories

Here’s how I think about WordPress Categories: if my WordPress site were a book, the Categories would be the headings I’d use for the Table of Contents. This isn’t quite right, but it’s a close enough way of looking at it.

If your site has a relatively narrow purpose, you should have relatively few categories. Categories should be broad. Think of your categories as sorting bins for your posts. Your posts fit into or under them. It’s OK if your posts fit into multiple categories, since there’s often overlap. You can create a hierarchy of categories as well, which can be helpful if you have a number of closely related category topics.

If you find that you are constantly writing posts that fit into the same group of categories, you should think about whether those categories would be better off consolidated into a single, broader category, and perhaps your former categories re-done as Tags.


Tags

Tags are like index keywords that describe the major ideas contained within your post. Think about the content of your post and what its main ideas or topics were, and tag appropriately. This is not an SEO game, where you try to guess every variation of the words people search by and include them all. So skip the -s/-ing/-ly game.

Tags should be short, single words or phrases of two or three words. Try to avoid redundancy, but some small amount is probably OK. WordPress separates tags with commas, so you don’t have to worry about using spaces. It’s OK to use spaces between words, rather than running words together.

I frequently see tags being misused as a sort of meta-commentary on the content of the post or page. This is witty and entertaining, and gives some personality to the site. I’m not sure that it’s helpful, but the occasional humorous tag can be amusing.

Witty tags work when you’re reading at the bottom of a post, or reading the summary or digest of an article before you click to Read More. But the intended way for your readers to use tags is to find other related content on your site that is of interest to them. If you over-do the witty tags, you’re not giving the reader useful ways to find a reason to spend more time reading your site.

How your site’s users use Categories and Tags

How, indeed? You can guess, and you can assume, but the truth is that unless you have some system of measurement that can watch your readers’ behavior while they’re on your site, you don’t have much of a clue how its users actually use the category and tag features.

With WordPress sites, it’s typically the authors who do the tagging and categorizing. Readers merely consume them. Some sites, where there is an element of, or even an emphasis on, user-generated content, give users the capability to create their own tags and categories. If your site does this, you absolutely need to observe and track your users’ behavior. It’s fascinating, amusing, and will give you a lot of insight.

If you retain sole control of the category and tag features, you need to think about what your readers need and how useful these features make your site. If you can, try NOT to rely on guessing or “common sense” to tell you this: find ways to observe user behavior (through logging, perhaps), or solicit user feedback, and use that to influence your planning and decisions.

Another useful thing to do is to monitor the way people are searching your site, or the search engine query that brought them to your site. The most common search terms your users used to find you should jump out as terms that you should use for tags, possibly for categories as well. And if you’re advertising your site, or using advertising to generate revenue on your site, knowing what terms users are searching for is crucial to drawing traffic and generating revenue.
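A first pass at mining those search terms can be as simple as a word count. The query list below is made up for illustration; in practice you’d pull it from your analytics export or your server’s logs:

```python
from collections import Counter

# Hypothetical search queries harvested from a site's search log.
queries = [
    "gamemaker tutorial", "wordpress tags", "gamemaker sprites",
    "wordpress tags vs categories", "gamemaker tutorial",
]

# Tally individual words; the most common ones are candidates for
# tags (or even categories) you should make sure exist on the site.
terms = Counter(word for q in queries for word in q.split())
print(terms.most_common(3))
```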

WP-Admin and the Category/Tag Renovation

My experience with this was that it could have been faster and less tedious. It’s probably my host more than anything, but it seemed that reloading the post, tag, and category administration pages took longer than I had patience for. Clicking update, then waiting a few seconds for the refresh, times however many posts I updated, adds up.

If I wanted to apply the same changes to multiple posts, there’s no way to do this through the web interface. A “mass action” feature to allow adding/removing the same category or tags to multiple posts at once would be very useful.

I could have attempted to directly manipulate the database through building a custom update query, but I didn’t want to sink time into doing that, didn’t want to run the risk of messing it up, and in any case, it’s probably beyond the capability of most WordPress bloggers, so I don’t recommend it. If you have an absolutely HUGE site that needs hundreds or thousands of changes to be made the same way, look into it. If you’re just dealing with dozens, just do it manually.

The other thing that would have been helpful was some kind of redundant tag merging. It’s not uncommon to apply very similar tags inconsistently over the history of your site.

For example, I used the tags “GameMaker” and “Game Maker” quite a bit. I had a few other GameMaker-related tags, which included a specific version, such as 8.0, 8.1, etc.

My first attempt at merging these was to simply re-name the “Game Maker” tag to match the label of my “GameMaker” tag. This did not merge the tags, though; it just created two identical tag labels, which were still separate as far as my WordPress site was concerned. A reader clicking on the “GameMaker” tag from one of my posts would only find about half of the posts I’ve written about Game Maker. Not good!

In order to fix this, I had to remove the redundant tag from my tagging system. To avoid losing the posts that I wanted to be tagged, though, I had to go through and re-tag those posts with the correct tag. At that point, I had a bunch of posts that had BOTH “GameMaker” tags — the correct one, and the incorrect tag that I’d re-labeled. I still needed to remove the incorrect tag to get rid of the redundancy, but looking at my Posts I couldn’t tell which was the redundant tag! So, I went back to the tag admin page, and changed the label of the incorrect GameMaker tag to “dup”, and then went through my posts and removed the “dup” tag.

It would have been much simpler, easier, and faster, if I could have simply navigated to the tag admin page, selected both the “Game Maker” and “GameMaker” tags, hit a button to merge the two tags, and specified which label I preferred to keep. I hope they include that feature in a future WordPress release.
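For the curious, here’s roughly what that merge could look like as direct queries, modeled in Python against an in-memory SQLite copy of a simplified slice of the WordPress schema (wp_terms, wp_term_taxonomy, wp_term_relationships). This is an illustrative sketch, not a tested recipe for a live database; run anything like it against a backup first. The trick is to drop the duplicate relationship rows for posts that already carry the kept tag before repointing the rest:

```python
import sqlite3

# Simplified stand-in for the real WordPress tables (same names/columns,
# most fields omitted). Post 100 has both tags; post 101 has only the dup.
conn = sqlite3.connect(":memory:")
c = conn.cursor()
c.executescript("""
CREATE TABLE wp_terms (term_id INTEGER PRIMARY KEY, name TEXT, slug TEXT);
CREATE TABLE wp_term_taxonomy (
    term_taxonomy_id INTEGER PRIMARY KEY, term_id INTEGER,
    taxonomy TEXT, count INTEGER DEFAULT 0);
CREATE TABLE wp_term_relationships (
    object_id INTEGER, term_taxonomy_id INTEGER,
    PRIMARY KEY (object_id, term_taxonomy_id));
INSERT INTO wp_terms VALUES (1, 'GameMaker', 'gamemaker'),
                            (2, 'Game Maker', 'game-maker');
INSERT INTO wp_term_taxonomy VALUES (10, 1, 'post_tag', 0),
                                    (20, 2, 'post_tag', 0);
INSERT INTO wp_term_relationships VALUES (100, 10), (100, 20), (101, 20);
""")

KEEP, DUP = 10, 20  # term_taxonomy_ids of the tag to keep and the duplicate

# 1. Drop the duplicate where the post already carries the kept tag.
c.execute("""DELETE FROM wp_term_relationships
             WHERE term_taxonomy_id = ?
               AND object_id IN (SELECT object_id FROM wp_term_relationships
                                 WHERE term_taxonomy_id = ?)""", (DUP, KEEP))
# 2. Repoint the remaining duplicate relationships at the kept tag.
c.execute("""UPDATE wp_term_relationships SET term_taxonomy_id = ?
             WHERE term_taxonomy_id = ?""", (KEEP, DUP))
# 3. Delete the now-orphaned duplicate term, and refresh the kept count.
c.execute("DELETE FROM wp_term_taxonomy WHERE term_taxonomy_id = ?", (DUP,))
c.execute("DELETE FROM wp_terms WHERE term_id = 2")
c.execute("""UPDATE wp_term_taxonomy SET count =
             (SELECT COUNT(*) FROM wp_term_relationships
              WHERE term_taxonomy_id = ?)
             WHERE term_taxonomy_id = ?""", (KEEP, KEEP))
conn.commit()

rows = c.execute("""SELECT object_id, term_taxonomy_id
                    FROM wp_term_relationships ORDER BY object_id""").fetchall()
print(rows)  # both posts now carry only the kept tag
```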


I’m sure there’s still more room for improvement with the way I’ve done it, but I’ve managed to clean up my categories considerably, and applied tags much more consistently through all of my posts. It took a couple of hours, but I hope it was worth it. I see a few benefits worth mentioning:

  • Users will have an easier time finding content that is relevant to their interests or related to something they came to the site to read.
  • It will increase the amount of time users spend using the site.
  • It will decrease the amount of time users waste on the site.
  • Better organization will convey to users that the site is of good quality.

Bad Google Chrome 17: What happened to Don’t Be Evil?

I just read this Ars Technica article on the Google Chrome 17 release and was not happy to read the following:

The new Chrome introduces a “preemptive rendering” feature that will automatically begin loading and rendering a page in the background while the user is typing the address in the omnibox (the combined address and search text entry field in Chrome’s navigation toolbar). The preloading will occur in cases when the top match generated by the omnibox’s autocompletion functionality is a site that the user visits frequently.

I bet this is going to piss off a lot of web server admins. Unless the pre-render is coming from Google’s cache, it’s going to put extra load on web servers. Web server stats will be inflated, giving a distorted picture for ad revenue. I’m sure Google’s smart enough to have thought of these things and has it all figured out, but I’d like to know what their answers were.

Google has also added some new security functionality to Chrome. Every time the user downloads a file, the browser will compare it against a whitelist of known-good files and publishers. If the file isn’t in the whitelist, its URL will be transmitted to Google’s servers, which will perform an automatic analysis and attempt to guess whether the file is malicious based on factors like the trustworthiness of its source. If the file is deemed a potential risk, the user will receive a warning.

Google says that data collected by the browser for the malware detection feature is only used to flag malicious files and isn’t used for any other purpose. The company will retain the IP address of the user and other metadata for a period of two weeks, at which point all of the data except the URL of the file will be purged from Google’s databases.
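To make the described flow concrete, here’s a sketch of the general check-locally-then-phone-home pattern. The function names, the whitelist, and the “verdict” logic are all made up for illustration; they are not Google’s actual API or criteria:

```python
import hashlib

# Hypothetical local whitelist of known-good file hashes (not Google's).
KNOWN_GOOD = {hashlib.sha256(b"trusted installer").hexdigest()}

def remote_verdict(url: str) -> str:
    # Stand-in for the server-side reputation analysis; a real service
    # would weigh source trust, signatures, prevalence, and so on.
    return "warn" if url.startswith("http://") else "ok"

def check_download(data: bytes, url: str) -> str:
    if hashlib.sha256(data).hexdigest() in KNOWN_GOOD:
        return "ok"              # matches the whitelist; nothing is sent
    return remote_verdict(url)   # otherwise the URL leaves your machine

print(check_download(b"mystery bytes", "http://sketchy.example/dl.exe"))
# warn
```

The privacy rub is visible right in the control flow: every download that misses the whitelist results in its URL being shipped off your machine.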

I sure hope this can be disabled. For one, whitelisting download files is the first step to a censored net. Second, it gives Google access to a record of anything you’ve ever downloaded. Your privacy is no longer a matter between you and the server. Now you have Google acting as a nanny, reading over your shoulder, making sure that what you’re pulling down over your network connection isn’t going to hurt you (and, very likely in time, that it isn’t “bad” in any other sense, either).

While they’re “protecting” you now, eventually they’ll get the idea that they should “protect” you from copyright violation, from information the government doesn’t want you to see for whatever reason, and so on. It puts Google in control over how most people access everything on the internet, and is vastly more power than any single entity should be entrusted with, no matter how competent, how corruption-resistant, or how well-intended they are.

I’m sure malware is still a very real problem, but personally I have not had a run-in with malware on any computer I’ve used in many years. Justifying Google’s right to do this by holding up malware as the bogeyman is a bit like saying that, due to the possibility of terrorism, you have no right to personal privacy or a presumption of innocence.

We need to speak up about this.

Follow the Leader: Firefox 5 and the State of the Browser Wars

Mozilla released Firefox 5 yesterday. I upgraded on one of my systems already, but haven’t done so on all of my systems due to some Extensions that are lagging behind in compatibility. These days I mostly use Chrome as my default browser, so I’m less apt to notice what might have changed between FF4 and FF5, and looking at the change list it doesn’t look like a huge release, which is another way of saying that Firefox is mature and can be expected to undergo minor refinements rather than major upheavals — this should be a good thing. FF4 seemed like a pretty good quality release. I’ve been a Firefox user since the early 0.x releases, and have been more or less satisfied with it, whatever its present state was at the time, since about 0.9.3. And before that I used the full Mozilla suite, IE4-6 for a few dark years when it actually was the best browser available on Windows, and before that Netscape 4. I actively shunned and ridiculed WebTV ;-). And I’d been a Netscape user since 1.1N came out in ’94. So, yeah. I knows my web browsers.

These are pretty exciting times for the WWW. HTML5 and CSS3 continue slowly becoming viable for production use, and have enabled new possibilities for web developers.

Browsers have matured and become rather good, and between Mozilla, Chrome, Opera, Safari, and IE, it appears that there’s actually a healthy amount of competition going on to produce the best web browser, and pretty much all of the available choices are at least decent.

It seems like a good time to survey and assess the “state of the browser”. So I did that. This is going to be more off the cuff than diligently researched, but here are a few thoughts:

After some reflection, I’ve concluded that we seem to have pretty good quality in all major browsers, but perhaps less competition than the number of players in the market might seem to indicate.

Hmm, “Pretty good quality”: What do I mean by this, exactly? It’s hard to say what you expect from a web browser, and a few times we’ve seen innovations that have redefined good enough, but at the moment I feel that browsers are mature and good enough, for the most part: They’re fast, featureful, stable. Chrome and Firefox at least both have robust extensibility, with ecosystems of developers supporting some really clever (and useful) stuff that in large part I couldn’t imagine using the modern WWW without.

Security is a major area where things could still be better, but the challenges there are difficult to wrap one’s head around. It seems that for the foreseeable future, being smart, savvy, and paranoid will be necessary to have a reasonable degree of security when it comes to using a web browser — and even then it’s far from guaranteed.

There has been some progress in terms of detecting cross site scripting attacks, phishing sites, improperly signed certificates, locking scripts, and the like. Still, it seems wrong to expect a web browser to ever be “secure”, any more than it would make sense to expect any inanimate object to protect you. It’s a tool, and you use it, and how you use it will determine what sort of risks you expose yourself to. The tool can be designed in such a way as to reduce certain types of risks, but the problem domain is too broad and open to ever expect anyone but a qualified expert to have a hope of having anything resembling a complete understanding of the threat picture.

That’s a can of worms for another blog post, not something I can really tackle today. Let’s accept for now the thesis that browser quality is “decent” or even “pretty good”. The WWW is almost 20 years old, so anything less would be surprising.

In terms of competition, we have a bit less than the number of players makes it seem.

Microsoft only develops IE for Windows now, making it a non-competitor on all other platforms. Yet, because its installed userbase is so large, IE is still influential on the design of web sites (primarily in that IE forces web developers to test for older versions of IE’s quirks and bugs). By now, one would hope, we’re very nearly done with this; the long tail of IE6 is flattening as thin as it can until corporations can finally migrate from Windows XP. Even MS is solidly on board with complying with W3C recommendations for how web content gets rendered. It seems that IE’s marketshare is held almost exclusively because it is the default browser for the dominant OS, particularly in corporate environments where the desktop is locked down and the user has no choice, and among the hordes of personal computer owners who treat their machine like an appliance they don’t understand, maintain, or upgrade. I suspect that the majority of IE users use it because they have no choice, or because they don’t understand their computer well enough or lack the curiosity to learn how to install software, not because there are people out there who genuinely love IE and prefer it over other browsers. I’m willing to be wrong on this, so if you’re out there using IE and love it, and prefer it over other browsers, be sure to drop me a comment. I’d love to hear from you.

Apple is in much the same position with Safari on Mac OS X as MS is with IE on Windows. Apple does make Safari for Windows, but other than web developers who want to test with it, I know of no one who uses it. Safari is essentially in the inverse boat that IE is in on its native platform: a non-competitor on every other platform.

This leaves us with Opera, Mozilla, and Chrome.

Opera has been free for years now, though closed-source, and has great quality, yet adoption still is very low, to the point where its userbase is basically negligible. There are proud Opera fanboys out there, and probably will be as long as Opera sticks around. But they don’t seem like they’ll ever be a major player, even as the major players always seem to rip off features that they pioneered. They do have some inroads on embedded and mobile platforms (I use Opera on my Nokia smartphone rather than the built-in browser, and on my Wii). But I really have to wonder why Opera still exists at this point. It’s mysterious that they haven’t folded.

The Mozilla Foundation is so dependent on funding from Google that Firefox vs. Chrome might as well be Google vs. Google. One wonders how long that’s likely to continue. I guess that as long as Google wants to erode the entrenched IE marketshare without appearing to swap in a monopoly of its own, it will continue to support Mozilla and, in turn, Firefox. Mozilla does more than just Firefox, though, so that’s something to keep in mind. A financially healthy, vibrant Mozilla is good for the market as a whole.

Moreover, both Chrome and Firefox are open source projects. This makes either project more or less freely able to borrow not just ideas, but (potentially, from a legal standpoint at least) actual source code, from each other.

It’s a bit difficult to describe to a proverbial four year old how Mozilla and Chrome are competing with each other. If anything, they compete with each other for funding and developer resources (particularly from Google). Outwardly, Firefox appears to have lost the leadership position within the market; despite still having the larger user base, it is no longer driving the market to innovate. Firefox has largely ceded that to Google (and even when Mozilla was given credit for innovating, much of what it “innovated” was already present in Opera, and was merely popularized and re-implemented as open source). And with each release since Chrome was launched, Firefox continues to converge in its design, looking and acting more and more like Chrome.

It’s difficult to say how competing browsers ought to differentiate themselves from each other, anyway. The open standards that the WWW is built upon more or less demand that all browsers not differentiate themselves from each other too much, lest someone accuse them of attempting to hijack standards or create a proprietary Internet. Beyond that, market forces pretty much dictate that if you keep your differentiating feature to yourself, no web developers will make use of it because only the users of your browser will be able to make use of those features, leaving out the vast majority of internet users as a whole.

Accelerating Innovation

After releasing Firefox 4, Mozilla changed its development process to accommodate the accelerated type of release schedule that quickly led to Google becoming recognized as the driver and innovator in the browser market. Firefox 5 is the first such release under the new process.

This change has met with a certain amount of controversy. I’ve read a lot of opinion on this on various forums frequented by geeks who care about these things.

Cynical geeks think that it’s marketing driven, with the version number being used to connote quality or maturity, so that commercials can say “our version number is higher than the competitor’s, therefore our product must be that much better.” Cynics posited that Chrome’s initial release put it so many versions behind IE/FF/Opera that Google needed to “make up excuses” to rev the major version number until it “caught up” with the big boys.

While this is something that we have seen in software at times, I don’t think that’s what’s going on this time. We’re not seeing competitors skipping version numbers (like Netscape Navigator skipping 5 in order to achieve “version parity” with IE6) or marketing-driven changes to the way a product promotes its version (a la Windows 3.1 -> 95 -> 98 -> 2000 -> XP -> Vista -> 7).

Some geeks, I’ll call them versioning “purists,” believe that version numbers should “have integrity,” “be meaningful,” or “stand for something.” These are the kind of geeks who favor software projects where the major number stays at 0 for a decade, even though the application has been in widespread use and in a fairly mature state since 0.3 and has a double-digit minor number. To them, the major release number denotes some state of maturity and has criteria which must be satisfied in order for that number to go up; if it ever should go up for the wrong reasons, it’s an unmitigated disaster, a triumph of marketing over engineering, or a symptom that the developers don’t know what they’re doing since they “don’t understand proper versioning.”

From this camp comes the argument that revving the major number so frequently must mean the developers are delivering less with each rev, which dilutes the “meaningfulness” of the major version number, or somehow conveys misleading information. So much less is delivered with each release that the major number no longer conveys what they believe it ought to (typically a major code base architecture, a backward compatibility boundary, or something of that order). These people have a point, if the major number is indeed used to signify such things. However, they would be completely happy with the present state of affairs if only there were a major number ahead of the number that’s changing so frequently. In fact, you’ll hear them make snarky comments that “Firefox 5 is really 4.1,” and so on. Just pretend there’s an imaginary leading super-major version number, which never changes, guys. It’ll be OK.

Firefox’s accelerated dev cycle is in direct response to Chrome’s. Chrome’s rapid pace had nothing to do with achieving version parity. In fact, when Chrome launched in pre-1.0 beta, it was, in terms of technology at least, actually ahead of the field in many ways. Beyond that, Chrome hardly advertises its version number at all. It updates itself about as silently as it possibly can without actually being deceptive. And Google’s marketing of Chrome doesn’t emphasize the version number, either. It’s the Chrome brand, not the version. Moreover, they don’t need to emphasize the version, because upgrading isn’t really a choice the user has to make in order to keep up to date.

Google’s development process has emphasized frequent, less disruptive changes over less frequent, more disruptive ones. It’s a very smart approach, and it smells of Agile. Users benefit because they get better code sooner. Developers benefit because they get feedback on what they released sooner, meaning they can fix problems and make improvements sooner.

The biggest problem Mozilla users will have with this is that extension developers are going to have to adjust to the rapid pace. Firefox has a built-in check which tests whether an extension is designed to work with the version of Firefox that is loading it. This is a simple, dumb version-number check, nothing more. So when the version number bumps but the underlying architecture hasn’t changed in a way that affects the extension, the extension is disabled anyway: the version number is disqualified, not necessarily because of any genuine technical incompatibility. Often the developer simply raises the maximum version number the check will allow, and that’s all that is needed. A more robust checking system that flagged actual technical incompatibilities might help alleviate this tedium. But if and when the underlying architecture does change, extension developers will have to become accustomed to responding quickly, or run the risk of becoming irrelevant through obsolescence. Either that, or Firefox users will resist upgrading until their favorite extensions are supported. Neither situation is good for Mozilla.
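
The check itself is simple enough to sketch. What follows is a toy illustration, not Mozilla’s actual code (real extensions declare a minVersion and maxVersion in their install manifest, and Firefox’s version comparison handles more exotic version strings than this):

```python
# Toy sketch of a min/max version-range compatibility check, similar
# in spirit to (but much simpler than) Firefox's check against an
# extension's declared minVersion/maxVersion.

def parse_version(v):
    """Turn a dotted version string like '3.6.13' into a tuple (3, 6, 13)
    so that ordinary tuple comparison orders versions correctly."""
    return tuple(int(part) for part in v.split("."))

def is_compatible(browser_version, min_version, max_version):
    """True if browser_version falls within [min_version, max_version]."""
    b = parse_version(browser_version)
    return parse_version(min_version) <= b <= parse_version(max_version)

# An extension declared for Firefox 3.6 through 4.0 gets disabled on
# Firefox 5.0 purely on the number, even if it would actually work fine:
print(is_compatible("4.0", "3.6", "4.0"))  # True
print(is_compatible("5.0", "3.6", "4.0"))  # False
```

The point of the sketch is that nothing about the extension’s actual behavior is tested; only the declared range is consulted, which is why a harmless version bump can knock working extensions out.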

Somehow, Chrome doesn’t seem to have this problem. Chrome has a large extension ecosystem, comparable to Firefox’s; indeed, many popular Firefox extensions have been ported to Chrome. Yet I can’t recall ever being warned that any of my Chrome extensions were no longer compatible because Chrome updated itself. It seems like another win for Chrome, and more that Firefox could learn from it.

I have to give a lot of credit to Google for spurring the innovation that has fueled browser development in the last couple of years. The pace of innovation when Mozilla and Opera were the leaders just wasn’t as fast, or as influential. With the introduction of Chrome, and the rapid release schedule that Google has successfully kept up, the entire market seems to have been invigorated. Mozilla has had to change its practices in order to keep up, both by speeding up its release cycle and by adopting some of the features that made Chrome a leader and innovator, such as isolating tabs in separate processes and drastically improving JavaScript performance. Actually, it feels to me that most of the recent innovation in web browsers has been due to the leadership of Chrome, with everyone else following the leader rather than coming up with their own innovations.

In order to be truly competitive, the market needs more than just the absence of monopoly. A market with one innovator and many also-rans isn’t as robustly healthy as a market with multiple innovators. So, really, the amount of competition isn’t so great, and yet we see that the pace of innovation seems to be picking up. Also, it’s strange to be calling this a market, since no one at this point is actually selling anything. I’d really like to see some new, fresh ideas coming out of Mozilla, Opera, and even Microsoft and Apple. As long as Google keeps having great ideas coupled with great execution, and openness, perhaps such a robust market for browsers is not essential, but it would still be great to see.

Intellectual property value of social networking referrals

One thing I have noticed over my years of using the social web (fb, twitter, livejournal) is that human culture instinctively places a value on linking to things in a way that I find odd. There’s a type of “intellectual property” that people conventionally recognize as a sort of matter of natural course. I don’t know how else to describe it than that.

In real value terms this sort of intellectual property is worth very little, but in social-etiquette terms the value is more substantial. The phenomenon is one of credit: not credit for authorship, but credit for finding and sharing. If you find something cool and blog about it, and you’re the first one in your little social group to do so, you get some kind of credit for being on top of things, cool enough to know where to look, or lucky enough to be in the right place at the right time. It’s not much more than that, but somehow, if you post the same link and are not the first in your group to do so, and don’t acknowledge the coolness of the person you saw post it first, it can ruffle feathers, as though people think you’re trying to be the cool, original one and are stealing other people’s “cool points” by not acknowledging where you got your cool link.

It’s funny though since posting a link is an act of evaluation (“I judge this content to be worthy of your time, so I’m sharing it.”) rather than an act of creativity (if you want to be really cool, go author some original content and see how many people you can get to link to that.)

What I take from this is two things:

  1. Having good enough taste in something to make a recommendation which one of your friends will pass along to others is an important, valuable thing in itself. Having this sort of taste implies that you are cool.
  2. Getting there first is important, or perhaps acknowledging who was cool enough to turn you on to something that you found cool is important.

One of the things about Facebook that I like a lot is that they get this, and implement it in such a way that it basically works automatically. You can click “Share” and it handles crediting whoever you got it from in a behind-the-scenes sort of way that makes you follow the etiquette convention automatically, thereby avoiding being a leech or a douchebag. On LiveJournal, by contrast, this is a somewhat useful way to discern which of your friends is a douchebag: if they don’t think to credit someone for showing them something you’ve already seen, you know they’re not with it, or at least aren’t following their friends-list all that closely.


Another interesting thing is that sometimes people will just post a link without any comment, while other times they will add their thoughts as an annotation. Sometimes no comment is needed, or one is implied by the context of how you know your friend, what they are about, and why they would post that link. Other times, people will write something reasonably lengthy and thoughtful on the subject they are linking to. This happens much more on LiveJournal than on Facebook or Twitter, which are geared toward more structured but forcibly brief content. I think LiveJournal encourages more expressive posts because people tend to use pseudonyms and write with somewhat more anonymity than on Facebook, where most people use their real names. I do like the way Facebook’s comment conversations flow once a topic hits someone’s wall. It’s also interesting to see how different groups of friends come to the same linked content and have different or similar conversations about it.

I think it would be fascinating to be able to visualize through some sort of graphic how sub-circles of an individual’s friends might converge through common interest in some topic. In my own Facebook experience, it has been interesting to see people I know from elementary and high school mixing with people I knew from college and afterward, and from various workplaces, and so on. I think it would be really interesting to see this sort of interaction on a very large scale, sort of a Zuckerberg’s-eye view of what’s going on in the various social circles that occupy Facebook. I can mentally picture colored bubbles occupying various regions of space, mixing at the edges, colors blending like wet paint.

I also think it’s interesting how the constraints and style of the different social sites shape behavior and the characteristics of the groups who use them. Facebook users in my experience have tended to be more sedate, drier, and thoughtful, though not always. Substantial numbers of my friends seem comfortable goofing and making fools of themselves, or being outspoken to the point of risking offending people of a differing political polarity. Twitter seems to be a land of important headlines mixed with one-liner witticisms and the occasional bit of Zen. Livejournal seems to be more private, insular, and diary-ish. I almost said “diaretic” but that sounds a lot like another word which, actually, might be even more appropriate, if disgusting. Discussting? Heh.

OK, I’m clearly blogging like I’ve been up for too long, and I have. But I hope to revisit these matters with more thought and see if something materializes out of it that is worthy of linking to and discussing. This could end up being someone’s Social Media studies PhD thesis :P

Three eras of searching the world wide web

A little late to the game and perhaps obvious, I know, but I was just musing and it occurred to me that there are perhaps three distinct eras for the way people using the world wide web have found information:

  1. The Yahoo era: A cadre of net geeks personally indexed and recommended stuff for everyone to look at when you told them what you were looking for.
  2. The Google era: A massive cluster of robots scoured the internet, figured out which web sites looked pretty good, and matched them up with what you told them you were looking for.
  3. The Facebook era: Your friends find something cool/funny/useful/outrageous and post something about it, leading you to do the same.

Ok, so yes, that’s pretty obvious to anyone who’s been on the web and paying attention from 1994-onward or earlier. Predicting what the next era will be is of course the billion dollar question.

The obvious thing that comes to mind is that things will just remain this way forever, and of course this is false and just a failure of imagination.

The next most obvious guess at what the future will bring is to combine the stuff that happened in the previous eras in some novel way. The Facebook era is kind of like that — instead of a hand-picked WWW index managed by the geeks at Yahoo!, we have a feed (rather than an index) of links which our social contacts (rather than a bunch of strangers working for Yahoo!) provide for us to check out.

So, perhaps just doing a mashup of the Facebook and Google eras would point to what the next breakthrough in search might look like. Let’s try that:

Mash 1: Our social contacts create a cluster of robots who index the WWW and come up with a custom-tailored PageRank algorithm tied to what turns our crank.

Hmm, intriguing, but unlikely. Most of our social contacts probably don’t know enough about technology to do that.

Mash 2: The behavior of our social contacts is monitored by robots who analyze the information that can be datamined out of all that activity, and use it to beat our friends to the punch. Especially for marketing purposes.

Much more likely! What we’re doing on social networking sites is already closely watched and analyzed by hordes of robots. All it would take is for someone to come up with the idea and implement it.

And it’s a good enough idea that I bet there are already people working on this right now. In fact, there definitely are if you consider social media advertisers. But I’m also thinking about more general purpose informational search.

In fact, after I congratulate myself on what a clever prognosticator I am and hit Publish, I bet within 15 minutes someone will post a comment with a link to a company that’s doing exactly this.

I mean, of course I could save myself the embarrassment and google around and see if I could find that myself, but it’s so last-era.

I want to see whether the Facebook era will bring the information to me with less effort expended. It may or may not be faster than the google era, but faster isn’t always the most important thing — sometimes there’s a tremendous amount of value in getting information from a friend that could easily have been looked up through a simple query to google.

5… 4… 3… 2…

A few things you should know about SEO

I have a brother. He started a business earlier this year, and recently asked me about Search Engine Optimization (SEO) for his web site.

I went to a two-day class on the topic earlier this year, which means I’m by no means an expert, but I’m a pretty quick study and I’ve been using and following the world wide web since very nearly the beginning. I figured I should answer his question, and while I’m at it, I figure it’ll make a decent blog post.

So, here’s a few things you should know about SEO:

SEO is not a goal; it is a means to an end.

Everyone wants to be number one. But being the top-ranked search result, or even on the first page, isn’t really the whole point of SEO. A good position in search results is a means to an end: we pursue it in order to drive traffic to our site.

Depending on the site in question, simply driving traffic to it may not be the goal either. What do you want that traffic to do once it arrives at your site? That depends on the purpose of your site. Being found is only the first step. What will you do once they’ve found you?

There’s a zillion reasons people put web sites up, but most of them boil down to making money at some point. How do you do that with your site? Is it through advertising? Subscriptions? E-commerce? Establishing relationships with customers? Gathering user data and selling it?

However you do it, most likely the more traffic your site gets, the more revenue you’ll take in. Ranking highly for popular search terms is a good and important means of driving traffic to your site, but it’s not the only thing you can do to achieve that.

Whatever you do, don’t lose sight of why you want the traffic in the first place.

SEO is but one means of driving traffic to your site

Consider — and make use of — all methods that you can:

  1. internet advertising
  2. traditional media advertising (print, radio, tv, billboards)
  3. direct marketing (mailers, pamphlets, brochures, flyers, business cards)
  4. word of mouth
  5. other sites linking to you
  6. linking to yourself from elsewhere

Optimization is relative to the search term

People talk about “optimizing my site for search engines” and there are indeed a few technical things you can do with your site that will make it friendlier to search engines in general — and I’ll be getting to those.

But when you talk about SEO, you really are talking about optimizing for a specific term (or list of terms), not generically “optimizing your site.”

People searching for your site specifically are likely to find it very easily, even if they don’t know your domain name. Search for this site’s name, for example, and you’re pretty much guaranteed it will be high on the list. Search for my name, and you’ll also find this site pretty high up in the results. Search for “IT consulting” or “web design” or something equally generic, and, well, I’m sure I don’t rank so well. Another example: last year I created a class on Cascading Stylesheets called “Streetwise CSS”; if you search for that specific term, “streetwise css”, I’m highly visible. But if you search just for CSS, I’m lost in the crowd of good resources you’ll find when you google the term “css”.

It’s easy to be found if you’re unique

The reason for that is simple: a unique term doesn’t have to compete with a million other web pages. If there are 10 results on the first page and you’re the only one on the entire internet using that particular term, well, guess what? You win by default. No contest.

Optimizing for a unique term is easy. It’s also a great idea. If you have one specific term, such as your name, that you can get out there through branding and marketing, people will start searching for that term and they’ll find you easily. But, the catch with unique terms for SEO is that since they’re unique, that means no one else is using them, and if no one else is using them, that’s probably because no one else knows them.

So one of the tricks of SEO is having a unique name or other term that could be used by people to search, but isn’t being used yet. Invent a good name that no one knows yet, make sure you are the top result for it on all the search engines, and then go about making it known. YouTube. Flickr. Pixlr.

You need to be where they’re looking

Having a unique, easy to find search term will rank you high on a search engine’s results for that specific term, but if no one’s searching for that term, it’s not going to boost your traffic. Cornering the search results market on a specific, unique term is not all there is to SEO. Far from it. You also need to try to get a piece of the action from very common search terms. SEO for a unique term is easy and valuable because it enables you to stand out from the crowd.

Ranking high in search engine results for very common search terms is much harder, but it’s even more valuable, because it puts you in a prominent position before the crowd of people searching for that term; the easiest way to draw a crowd to your site is to position it in the middle of a crowd to start with. It’s hard because there’s not just a crowd of searchers, there’s also a crowd of sites looking to be found. And there are only about 10 results on the first page of most search engines, which they’re all competing for. Still, it’s worth competing for those high-ranking results for common terms, because so many people are searching for them.

To put it another way, looking for you is very different from searching for what you do, what you are, or what you want to be known for.

Learn to think like someone who’s searching for whatever it is you want to be found for.

You have to do some marketing research, use common sense and psychology, and come up with lists of terms that people who need your site are likely to be searching for. Figure out those words, optimize for them, and your site will rank highly for the people who need you. Set up Google Analytics on your site and you’ll be able to see the search terms people used to find it. Look at that list, and start filling in blanks. Figure out synonyms, regional terms, and alternate spellings for what you want to be known for. Let your list seed a brainstorm so you can come up with other terms that people might be searching on, but not finding you. Then optimize your site so that the next time someone searches for one of those terms, they do find you.

Learn to think like a Search Engine Rank Algorithm (and a Web Crawler, too)

This is where your technical specialists come in. Whoever’s designing, building, and maintaining your web site should be taking care of this for you. But you need to know at least something about this, so you can talk to your technical people.

Understanding how a web crawler works isn’t difficult. A web crawler is a program that goes out on the web and downloads pages and follows links. That’s how search engines obtain the content that they index and rank. That’s about all you need to know. Knowing this, you now understand the importance of making the information on your website accessible to the web crawler.
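
As a rough sketch, assuming the page has already been downloaded as a string, the “follow links” part of a crawler boils down to pulling `href` attributes out of anchor tags. This toy example uses only Python’s standard library; a real crawler would also fetch pages over HTTP, respect robots.txt, and keep track of which URLs it has already visited:

```python
# Minimal sketch of the link-extraction step of a web crawler.
from html.parser import HTMLParser

class LinkExtractor(HTMLParser):
    """Collects the href of every <a> tag seen in a page."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = '<html><body><a href="/about">About</a> <a href="/contact">Contact</a></body></html>'
extractor = LinkExtractor()
extractor.feed(page)
print(extractor.links)  # ['/about', '/contact']
```

Everything the parser can see in the raw HTML — text, titles, headings, links — is what the search engine gets to index, which is exactly why the structure of your markup matters.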

There are ways of building web sites that make it easier or more difficult for web crawlers to find everything. Valid, standards compliant, well-structured HTML that does not abuse or misuse tags is what you want.

No one quite knows for sure how search engines rank sites for specific search terms — it’s a tightly guarded secret, and it’s constantly being changed and tweaked as the internet evolves and as SEO experts learn how to game the system. We can guess, and we have some pretty good knowledge about what matters to ranking algorithms.

Here’s where words “count” most to a ranking algorithm:

  1. the site domain name
  2. the title element
  3. heading elements

This does not mean you should load these areas with words you hope people will search on! Search engines are wise to this and will penalize you for it; ranking algorithms have built-in diminishing returns for stuffing too many “hot words” onto your page. Choose a domain name wisely, and go for uniqueness (since every common word is already in use or very expensive) and branding. Use the title and heading elements in your HTML to make good, effective titles and headings. Make them good for humans first, but give thought to the machines that will visit your site, analyze its contents, and rank them for the search terms those humans will use to find you.
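
To make the “where words count” idea concrete, here is a purely illustrative scoring sketch. The weights are invented for this example; no search engine publishes its actual weights, and real ranking algorithms are vastly more sophisticated. The only point is that the same word counts for more in the domain, title, or headings than it does buried in body text:

```python
# Toy illustration of position-weighted term scoring. The weights
# below are made up for illustration only; real ranking algorithms
# are secret and far more sophisticated.
def score(term, page):
    term = term.lower()
    weights = {"domain": 10, "title": 5, "headings": 3, "body": 1}
    total = 0
    for field, weight in weights.items():
        total += page.get(field, "").lower().count(term) * weight
    return total

page_a = {"domain": "example.com", "title": "Streetwise CSS",
          "headings": "CSS selectors", "body": "css css css"}
page_b = {"domain": "example.com", "title": "My homepage",
          "headings": "Welcome", "body": "I once wrote about css"}

print(score("css", page_a))  # 11: once in title (5), once in headings (3), three times in body (3)
print(score("css", page_b))  # 1: once in body
```

Under any scheme shaped like this, a page that uses a term in its title and headings will outrank one that merely mentions it in passing, which is the practical takeaway for how you write your titles and headings.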

Avoid “hiding” your content where search engines won’t be able to find it:

  1. behind a login
  2. inside of flash objects that aren’t properly accessible
  3. in images without proper descriptive text

Ranking algorithms also care about how popular and important your site is, and about how popular and important the sites that link to you are. Exactly how this is determined is difficult to know, but generally speaking, links from high-quality, popular, important web sites will help boost you in search results. The more the better, though this is also very hard to accomplish. It’s probable that being popular on social networking sites will help boost your search ranking, and it will also drive a lot of traffic to your site outright through the users of those networks. So make it easy for people visiting your site to Like you on Facebook, to Tweet about something they found on your site, or to find your profile on LinkedIn. There are social bookmarking plug-ins for most popular web content management and blogging systems (WordPress, Drupal, Blogger, etc.) that can do this for you. Giving your visitors reasons to Like and link to you is up to you.

Enough of these will give you some boost in your ranking. But links from social networking sites are also “cheap” and easy to game the system with, so it may be that the ranking algorithm takes this into account, or will soon. The maintainers of these algorithms are constantly changing the rules to keep ahead of SEO opportunists who are looking for ways to game the system.

Build a Good Site and SEO almost takes care of itself

Really, if you’re doing things right, you almost don’t even need to think about SEO. A good site is one that provides value to its visitors and gives them reasons to come back. This should be fairly obvious to anyone. You need content that is fresh, constantly updated. You need information that is highly valuable or entertaining. You need engaging things for people to do.

Have a strategy. Why does your site exist? What is its purpose? How well is it achieving its goals? Measure and monitor everything you can about the site and analyze it.

Not all web sites need to be YouTube or Facebook. You don’t have to be giant or mega-popular or have brilliant, cutting edge technology or an idea no one has ever thought of before to be useful or popular with your market.

If all your site needs to be is a brochure and a means for customers to find you, then provide useful information for customers and prospective customers. Give customers accounts that they can use to log in and conduct business with you — placing orders, paying bills, asking questions, providing feedback to you, Liking you.

If people visit your site frequently, they will link to it, tell friends about it, and this will build your traffic and search engines will rank your site higher as a result.

Don’t attract a crowd only to have them find an incomplete or crappy web site!