Disclaimer: this blog entry is concerned with certain aspects of natural language processing and automated text analysis and may therefore appear excessively nerdy to the non-initiated. Read at your own risk.
VBDU and MWTTR - don’t worry, those aren’t breakfast cereals, government agencies or contagious diseases.
Every once in a while, I feel the need to brush up on my programming skills. Lately, most of what I’ve been doing has been centered around writing human-readable text (as opposed to machine-readable text, i.e. code) and therefore I felt a little PHP practice was in order.
The result of yesterday’s half-day coding session is this script. Read on for an explanation of what exactly it does.
Thinking of what I could possibly code, I remembered an interesting paper by Eniko Csomay that I came across a while ago. In it, Eniko suggests a methodology for segmenting texts into smaller units according to their internal structure*. How can you teach a machine (even in general terms) where one section of a text ends and a new one begins? In her article, Eniko suggests the following approach: if a text moves from one part to another (say the transition from analysis to conclusions in a scientific paper) it is plausible that the lexical material used changes. To simplify a little, one section is likely to use a specific bundle of recurring words, while another will use different terms. Eniko calls these sections vocabulary-based discourse units (VBDUs) and she has shown that variation between VBDUs can be used to find topical and argumentative shifts in a text.
How does it work in practice? VBDUs can be measured by taking a snapshot of N words from a text and then comparing it with another window of the same size that follows the first one.
Let me give an example:
text = Mary had a little lamb, John had a little pony
window size = 5 words
window 1 = Mary had a little lamb
window 2 = John had a little pony
The calculated difference between the two windows equals 2, because John and pony differ from Mary and lamb, while the rest of the words are identical.
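For the code-minded, here is a minimal Python sketch of that comparison. It assumes that "difference" means the number of word types in the second window that do not occur in the first; that reading matches the example above, but it is my own assumption and not necessarily the definition used in the original paper. The function names are made up for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into word tokens (punctuation is dropped)."""
    return re.findall(r"\w+", text.lower())

def window_difference(window_1, window_2):
    """Count the word types in window_2 that do not occur in window_1 (assumed measure)."""
    return len(set(window_2) - set(window_1))

tokens = tokenize("Mary had a little lamb, John had a little pony")
size = 5
window_1 = tokens[:size]            # mary had a little lamb
window_2 = tokens[size:2 * size]    # john had a little pony
difference = window_difference(window_1, window_2)
print(window_1, window_2, difference)
```

Here the difference comes out as 2, for john and pony.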
How can we calculate this variation for a text in its entirety? By moving through it, word by word.
If we move window 1 forward by a single word and do the same with window 2, the difference between the two windows may change. The example above isn’t terribly well-suited to demonstrate this, simply because the windows are very small, but if you boost window size to 50 or 100 words, you can get an idea of how this works.
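The word-by-word movement can be sketched in a few lines of Python. As before, this is only an illustration under my own assumptions (the difference is counted as word types in the second window that are missing from the first, and the function name and sample sentence are invented), not the code behind the script.

```python
import re

def vbdu_differences(text, size):
    """Slide two adjacent windows of `size` words through the text, one word at a time.

    Returns one difference score per position: the number of word types in the
    second window that do not occur in the first (an assumed measure).
    """
    tokens = re.findall(r"\w+", text.lower())
    scores = []
    # The last valid start position leaves room for two full windows.
    for start in range(len(tokens) - 2 * size + 1):
        first = set(tokens[start:start + size])
        second = set(tokens[start + size:start + 2 * size])
        scores.append(len(second - first))
    return scores

sample = "Mary had a little lamb, John had a little pony, and Sue had a big dog"
scores = vbdu_differences(sample, 5)
print(scores)
```

With a real text and a window size of 50 or 100, the list of scores is what would end up on the y axis of a chart.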
Another thing that I decided to implement in my little script is a measure called Moving Average Type-Token Ratio (MATTR)**. The terms types and tokens are used in computational linguistics to differentiate between unique words and total words in a text. To use the example from above, the sentence Mary had a little lamb, John had a little pony consists of 10 tokens (actually 11 if you count the comma), but only 7 types, because the words had, a and little occur twice and we only count each unique word once when looking at types.
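Counting types and tokens is easy to try out. The following Python sketch ignores punctuation (so the comma is not counted) and treats words case-insensitively; both choices are simplifications of mine.

```python
import re

def type_token_ratio(tokens):
    """Return the number of unique words (types) divided by the total number of words (tokens)."""
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)

tokens = re.findall(r"\w+", "Mary had a little lamb, John had a little pony".lower())
types = set(tokens)
ttr = type_token_ratio(tokens)
print(len(tokens), len(types), ttr)  # 10 tokens, 7 types, ratio 0.7
```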
Comparing the ratio of unique words to total words is useful for several reasons. Generally, we can expect written texts which convey a lot of information to have a higher type-token ratio than (for example) spoken conversation, where certain material is likely to occur again and again (say, the pronouns I and you). This difference is not absolute, but there is a strong tendency for information-dense pieces of discourse (scientific papers, legal texts) to have a higher TTR than less dense material (casual conversation, probably most blogs).
However, there’s a minor methodological issue. TTR is tied to text length and tends to decrease the longer a text is - the amount of lexical material at our disposal is simply not infinite and therefore the ratio inevitably goes down.
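The solution to this problem can be integrated into our approach to VBDU analysis: calculate the TTR for each of the two fixed-size windows, then move forward by a word and repeat the process. Because every window contains the same number of words, the resulting values are comparable no matter how long the text is.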
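As a rough Python sketch of the moving-window idea: compute the TTR for a window of fixed size at every position and, if a single number is wanted, average the results. The function names and the window size of 8 are picked purely for illustration and say nothing about how the script itself does it.

```python
import re

def moving_ttr(tokens, size):
    """Return the type-token ratio of every window of `size` consecutive tokens."""
    return [
        len(set(tokens[start:start + size])) / size
        for start in range(len(tokens) - size + 1)
    ]

def mattr(tokens, size):
    """Average the per-window ratios into a single moving-average TTR."""
    values = moving_ttr(tokens, size)
    return sum(values) / len(values) if values else 0.0

tokens = re.findall(r"\w+", "Mary had a little lamb, John had a little pony".lower())
values = moving_ttr(tokens, 8)
average = mattr(tokens, 8)
print(values, average)
```

Since every window has the same length, a longer text simply yields more values rather than a lower ratio.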
Right, so what’s the result of all this? Lo and behold
Go ahead and try it. Simply paste a text into the window, preferably over 1,000 words, and hit submit. A value of 100 for the window size seemed like a good idea to me - values of under 50 and over 250 appear to work less well.
The resulting chart is drawn using Google’s Visualization API and I think it looks quite spiffy. Here are two examples
The first chapter of Edgar Allan Poe’s short novel Arthur Gordon Pym of Nantucket (source text, visualization)
How can the results be interpreted? The x axis represents the progression of the text - essentially we are moving through the text word by word from left to right. On the y axis three normalized scores are represented: the word-based variability between our two windows (VBDU_diff, light blue), the type-token ratio of the first window (TTR1, red) and the type-token ratio of the second window (TTR2, orange).
Great, so what does it all mean?
By itself, probably not too much. You’re unlikely to find a clear-cut correlation between VBDU_diff peaks (those places where the difference between the two windows is highest) and shifts in topic or section transitions. Language is just too tricky for something that simple. But I can imagine there being interesting shifts in word class percentages and the like from one part of a text to the next. Integrating a part-of-speech tagger would be interesting, but that’s something I’ll save for another day.
In the meantime, try the script and let me know if you find something interesting. Visualizations are stored on the server for now and you can retrieve them later by using the URL at the bottom of each page.
* I need to note several things regarding my implementation of VBDU analysis:
I’ve reproduced the procedure from memory, meaning it is likely to differ from the original implementation in some form and may incorporate infelicities or errors
in addition to possible methodological flaws, simple programming bugs are also imaginable
as a result, use this at your own risk and do not cite or use this script in a serious context (i.e. publishing) without contacting me first
Two years ago, the news that Google was going to make available the largest collection of n-grams to the global research community that had ever been compiled sparked a lot of interest. I was among those who immediately ordered those six DVDs… and ever since they have been resting dutifully on a shelf in my office, collecting dust and reminding me that I need to bring them into a more accessible format. Alas, so many things to do, so little time.
Something led me to look for information on that corpus this morning and I came across this. Sadly, the link to Chris Harrison’s site no longer seems to work, but when I saw his visualization I immediately thought of Many Eyes.
My reasoning goes a little something like this:
Google N-gram corpus hosted on Google Palimpsest servers + IBM’s Many Eyes = Fantastic web-based tool for linguists
To elaborate: Google has a gigantic database of word collocations that can be used as a baseline for all sorts of interesting analysis, but you can’t really do any of these things unless you have a user interface and enough computing juice to sift through almost 100 gigabytes of text data on the fly. On the other hand, solutions like Many Eyes are amazing, but currently there’s no way you can use them with a really big data set like the n-gram corpus and therefore their research utility is limited.
But it must be possible somehow to bring together
the data to analyze
the computing power required and
the user interface needed to allow a non-technical person to interact with the data
and to put the whole thing on the Web. It’s Google’s stated intention to host data for us and they are the owner of the n-gram dataset, so I can’t imagine there being any licensing issues. And, as if to put a cherry on that sundae, here’s the announcement of a joint project by IBM, Google and the NSF to do exactly that kind of stuff. Put the 6 DVDs on a cloud, throw in a tweaked version of Many Eyes (think the word tree vis with a few extras) and construction grammarians everywhere will absolutely love it.
What do you think?
Google | IBM | Linguistics | Many Eyes | Visualization
Just because the subject came up in several contexts recently, I decided to make a screencast of me explaining the concept of f-score and applying it to some data from my corpus of company blogs. I tried to embed it in a blog post, but that caused several problems because the clip would neither fit nor scale for some reason.
Click here to view the screencast in a separate window. You can also download (right-click, save) and watch it in your favorite video player, which gives the additional luxury of being able to pause.
And apologies for my lapse of memory towards the end (which blogs am I comparing again?), but it was a long day and organizing a conference occupies a lot of brain cells. I hope it’s still informative.
Corporate Blogging | Delta | Johnson & Johnson | Marriott | Screencast | Style | Visualization
Last Wednesday, I had the opportunity to give a presentation on new forms of scholarly publishing, Open Access and Open Research at a virtual meeting organized by Catalina Danis of IBM’s Social Computing Group. It was great, although precious little time for discussion remained, due to a slightly overambitious (i.e. too voluminous) presentation on my part. Thankfully, the session next week will be used for discussion and I am very much looking forward to that. Once more, a big thanks to Catalina for inviting me and to everyone who attended.
Edit: if this looks strange, please reload the page. For reasons I cannot fathom slideshare’s embeds manage to blow up the page unless I manually adjust the source code…
IBM | Open Access Publishing | Open Science | Presentation
When I started JNJBTW, I thought my audience would be pretty much those who write about the business of healthcare — reporters, editors, healthcare bloggers — those folks. What I’ve found, after doing this for a year, is that the people reading this are, well, er, people. Doctors, nurses, consumers — employees and retirees — people who hate the company and people who support what we do — friends, neighbors, my father-in-law… well, you get the idea.
Now those who have been blogging for a while may think, “well, duh!?” but for me it was an important point — particularly since I’m often asked “who is your audience?” My answer, which many people scoff at, is that it is everybody — that I don’t define my audience, but that the audience defines itself.
From my recent post about style and audience design in corporate blogs:
Blogs are a part of the Internet and the Internet provides virtually anyone with near-universal access to information. This may seem like a truism, but it has significant implications. Whereas before groups of stakeholders would be targeted individually and the flow of information was highly controlled, this is no longer the case in a networked world. A careful examination of the Google-Sicko story reveals a case of audience underfitting, i.e. a company employee addressing a specific audience but effectively reaching a much broader readership (and, in this case, not with a positive result).
The problem encountered is the extreme reach and transparency of online publishing. Because we are used to addressing either individuals or select communities of people, suddenly reaching a diffuse, invisible and potentially vast audience is not always easy to handle. This is especially problematic when you talk about people who are also your readers (see the Google example).
As the author of a corporate blog, one thing to never forget is that your audience defines itself (well said, Marc!) and that you need to write accordingly. Forget all the cozy rhetoric about blogs being “personal” and “open” and so forth for a moment. The key thing to keep in mind is that the word you identifies the person(s) whom you are addressing and that words like they, users, consumers, the public etc. denote those people whom you are not addressing. You are talking to the first group and about the second group. The unique aspect of blogs is that all those people that you conceptualize as being in the second group are also in the first, since anyone can potentially be a reader of your blog. The Google-Sicko example illustrates what happens in such a case: talking about someone who is part of the discourse is generally regarded as highly antisocial. In terms of language, we split the world into three parties: ourselves and those “with us” (I/we), our discourse partners (you) and everyone else (he/she/<name>). Making your reader feel treated as a third party is a mistake you don’t want to make.
Sometimes a picture says more than a thousand words - especially when the picture is rather fussy and complicated. I’ve created a map of corporate blog subtypes, the functions they realize and the audiences they address. It’s clearly idealized, but I think it captures the essentials reasonably well.
Have a look at it here. I couldn’t fit it into a blog entry because, as you can see, it takes up quite a bit of screen space.
Here’s one point in the piece that caught my attention in particular:
3. Being conversational is unnatural:
Being conversational is unnatural in business communications because we’ve been taught NOT to do it. Communication specialists are used to writing “Press Releases” and marketing web pages. The good news is that outside of work, employees are very good conversationalists, so they already know how to do it, they just need to break some of their Old Media habits. Training works very well in this area. Lastly, companies cannot forget the most important ingredient of a corporate blog — transparency. Corporate blogs are conversational and transparent, and therefore should NEVER be used to spew traditional marcom drivel.
I have been thinking about the style of blogs and corporate blogs in particular for almost two years now. The persistent chant ‘blogs are conversations’ and ‘conversational good, business-speak bad’ has a tendency to drive the professional linguist in me nuts, not because I don’t agree with these popular ideas, but because I keep wondering what exactly conversational means and why it is unequivocally regarded as ‘better’.
Now, as I am gradually approaching the completion of my thesis, I think I can give a carefully weighed answer to that question.
Blogs are conversations? Partly yes, partly no
Firstly, when bloggers talk about ‘conversational’ what exactly do they mean?
Real-life conversations between human beings use many expressions that depend on the situational context to be understood. Things like that guy standing right there (so-called deictic expressions), false starts (And I was…. we didn’t go… No, Sue and I didn’t go to the meeting) and fillers (We need to… umm… discuss this in more detail) abound in face-to-face talk. Conversations also typically contain a lot of signals that serve purely to confirm and validate what your communicative partner is saying (things like yeah, okay, gotcha, right, uh-huh, nodding etc.) and indicate your stance and social relationship. While conversations in TV shows, plays, novels and so forth are fast, witty and fluent, real conversations are often anything but - it’s just that we’re very good at ignoring all the noise they contain. We subconsciously filter out most of the static.
Blogs are obviously different in that blog entries are planned and not spontaneous (forget all the cutesy rhetoric associated with the word spontaneous for a moment - I use it to simply mean ‘instantly expressed’). Many bloggers, and most certainly the majority of corporate bloggers will read a post they have written thoroughly before publishing it. In the case of marketing and PR-oriented blogs and with executive blogs such as that of Jonathan Schwartz it is safe to assume that an entire team of communications professionals reads, discusses and edits posts collaboratively before they are published. There is planning and polishing involved, none of which is possible in real-time conversation.
So it’s not that aspect of blogs that makes us think of face-to-face conversations. What we associate with interpersonal communication is the interactive nature of blogs - in other words, that they enable a dialog between blogger and reader. Our reasoning goes: ‘I can respond to what someone writes in their blog, so it is basically like a conversation’. The other aspect is language: the content and style of writing that is associated with blogs. Note that point - blogs are written, not spoken language, which means that none of the ‘noise’ described above occurs in them. Many things characteristic of spoken language never occur in blogs, especially not corporate ones.
Subjective as conversational
So apart from interactivity, what else is conversation-like about (corporate) blogs?
Have a look at this excerpt from One Louder, the blog of Microsoft staffing manager Heather Hamilton:
I’m not sure what has gotten into me other than the fact that I am happier than I have been for a VERY long time. It’s funny how sometimes things can just fall into place. The changes that I wanted to have happen at work happened without me doing much about it (other than saying “this is what I want”). I have finally started to spend some weekend time relaxing (and hanging with friends). And I am starting to believe what Eckhart Tolle says about coincidences not happening; it’s all for a reason (and with most of my life, I get the reasons for even some of the unpleasant things happening). Example: last week my manager and I were talking about me needing to travel to one of our dev centers. She recommended Ireland (oh yeah, I am totally doing that!) and I said “why don’t we have a dev center in Amsterdam? I really want to go there.” Then this week, I got an e-mail inviting me to speak at a conference in Amsterdam. How ’bout that? I’ve decided not to question what forces (if any) could be invovled with things like that happening. I’m just going to enjoy it.
In addition to business-related topics, Heather frequently writes about her personal feelings, thoughts and experiences in her blog, something that I’ve found to be typical of what I call ‘personal company blogs’. Such blogs are written by just one person, have a clearly visible reference to the blogger on the front page (name, photo) and are often part of a larger company blog hub (MSDN, in this case). In contrast to personal company blogs, team company blogs are usually about a specific product, issue or segment of the company and have several authors. I’ve found that writing about personal thoughts and feelings is less common in team blogs, largely because the topical focus of the blog tends to override personal concerns. By contrast, personal company blogs tend to be understood by their owners as diaries or journals where work-related subjects are integrated with personal thoughts.
The kind of language used in corporate contexts (pre-blogging) is fairly strictly focused on a fixed set of topics. To quote Mike:
The world of business found in real life language is a limited one made up of business people, companies, institutions, money, business events, places of business, time, modes of communication and vocabulary concerned with technology. The language found was surprisingly positive, with very few negative words featuring at all. It was also found to be dynamic and action-orientated and non-emotive.
What Mike found via his large database of language samples from real-life business settings was that corporate language largely centers on things associated with business, namely business people, companies, institutions, money, business events, places of business, time et cetera and that these things are generally presented positively (business is about getting things done, not about being self-reflexive or critical). Finally, the subjective emotions of stakeholders aren’t really very important - private matters don’t figure in corporate discourse in any significant way.
Now compare that to how Heather writes. It’s a world of difference.
In posts marked with the ‘personal blogging’ tag, Heather writes about aspects of everyday life that we are all familiar with: buying furniture and cleaning out the garage, cheering for a sports team and experiencing a blackout. Not everything is always positive - there are ups and downs. Heather’s language can certainly be described as ‘emotive’ or ‘involved’, not because it is necessarily always highly emotional, but because it is concerned with inner processes more than with actions. All of this is obviously in stark contrast to what language in most other corporate contexts looks like.
There are a number of reasons why a ‘conversational’ style in that sense of the word is typical for both non-corporate and personal company blogs and why I expect it to have an influence on how institutions communicate, present themselves and are perceived in the future. I’ll focus on three basic pillars: audience, content and style.
Who you talk to
Blogs are a part of the Internet and the Internet provides virtually anyone with near-universal access to information. This may seem like a truism, but it has significant implications. Whereas before groups of stakeholders would be targeted individually and the flow of information was highly controlled, this is no longer the case in a networked world. A careful examination of the Google-Sicko story reveals a case of audience underfitting, i.e. a company employee addressing a specific audience but effectively reaching a much broader readership (and, in this case, not with a positive result).
The problem encountered is the extreme reach and transparency of online publishing. Because we are used to addressing either individuals or select communities of people, suddenly reaching a diffuse, invisible and potentially vast audience is not always easy to handle. This is especially problematic when you talk about people who are also your readers (see the Google example).
What you talk about
One notable aspect of Heather’s blog (and many others like it) is how openly it presents personal thoughts, experiences and feelings to readers. This is not necessarily done just for the audience. It seems that many personal company bloggers, though quite aware that their blogs are public, write partly to record their thoughts for themselves much in the same way that diarists do. The blog is a chronicle of what the blogger has thought, felt and done over time, both personally and professionally. Not every personal detail imaginable is presented, but there is no strict (and artificial) separation of personal and professional topics. Independently of how bloggers conceptualize audience, the effect of sharing personal information is that it lays the foundation for relationship-building.
Being told the subjective impressions, thoughts and emotions of another human being is almost inevitably relevant to us because we value such social information very highly. Knowing personal aspects of someone’s life brings us closer to them and establishes ties which are the foundation of any interpersonal relationship. This is especially pivotal on the Internet where all voices are detached from the individuals who use them. Social information enables us to establish a relationship with someone whom we have never met, because what we know about someone allows us to draw an increasingly complete picture of what kind of person they are.
Social information as a universal currency is especially valuable in a globalized and networked world, because exchanging it builds trust and without trust the foundation for other interactions is lacking.
How you say it
There is a persistent belief that jargon, technical language and other forms of special purpose lingo exist purely to irritate those of us who don’t understand them. That’s not quite true though - medical language or legalese may have that effect on people who aren’t doctors or lawyers, but among those who speak them these varieties are readily understood and used for plausible reasons. Jargon allows us to
delineate membership in an expert community (techies, lawyers, bloggers…)
describe aspects of our work/community/culture/shared experience with more perceived precision than ‘standard’ language allows
In other words, we often feel that what we want to say is said more effectively when we use a specialized vocabulary developed to express it. While this is unproblematic as long as we are talking to others who share our knowledge, it instantly turns into an issue when we address a broader audience - which is inevitably the case with a blog. All of a sudden, use of a specialized terminology makes us seem aloof, arrogant and out of touch. Audience underfitting once again leads to problems, this time in stylistic terms.
Finally, ‘conversational’ in stylistic terms also implies the use of colloquialisms, figures of speech and other expressive elements which are typically found in spoken conversation. The effect of such devices is again that they allow blogger and audience to conceptualize the blog as a speech situation, amplifying feelings of solidarity and familiarity.
What ‘conversational’ can mean
To summarize, ‘conversational’ can mean a range of things when applied to blogs. Among them are:
interactivity - it can describe the dialogic structure of blogs and the possibility to respond to contributions
speaker and audience - it can describe the discourse situation that the blog creates on a technical level and the resulting possibility for the blogger to refer to himself/herself (“I”) and address his/her readers (“you”)
content - it can describe a focus on personal and everyday topics which are familiar to a broad audience and create a feeling of solidarity and familiarity with the blogger
style - it can describe the avoidance of jargon and technical language (due to its audience-restrictiveness) in favor of expressions that evoke spoken language and real-life conversation
As always, feedback is appreciated.
Corporate Blogging | Definitions | Google | Research | Style
I’ve recently discovered Project Bamboo, an initiative that describes itself on the project website as a multi-institutional, interdisciplinary, and inter-organizational effort that brings together researchers in arts and humanities, computer scientists, information scientists, librarians, and campus information technologists to tackle the question:
“How can we advance arts and humanities research through the development of shared technology services?”
Come again? At first, the concept of shared technology services may seem a little vague. But a closer look at the full project proposal makes it fairly clear what is meant.
While academics use digital technology and the Net for a wide variety of things (research, teaching, publishing, communication), all of these uses have a degree of improvisation to them. Very few of the tools we use are developed specifically for the context of science and research, and sometimes this limitation shows.
For example, I’ve started to use del.icio.us to tag all books I read in Google Books (see what I’ve recently tagged). Del.icio.us is an all-purpose bookmark management application, yet the ability to collaboratively create bibliographies with colleagues in the same subfield makes it a useful tool for researchers. Del.icio.us is not the only example - Google Documents can be used to collaboratively work on a publication and SlideShare is great for making your presentations available directly and linking them to your CV (see my own), instead of just offering them for download. But for other, more specialized tasks there is still a severe lack of tools.
A few months ago, a colleague of mine needed a corpus (a collection of texts for linguistic analysis) for her research. Corpora exist in a wide variety of shapes and sizes, but the specific issue she was working on made it necessary for her to create an entirely new corpus (built from blog texts) instead of working with material from more traditional sources (newspapers, fiction etc). In addition, she also had only a basic working knowledge of corpora and the ways in which they can be used.
We approached the problem from two different angles. I helped her build a specialized corpus by using a piece of software that I had developed for my own work on blogs. To analyze the data, I pointed her to two interesting functions of Many Eyes, a web-based application for visualizing statistical information: tag clouds and word trees.
Tag clouds (or, in this case, word clouds) make it possible to visualize how often a word occurs in a piece of writing. Simply paste a text into the appropriate form field on the site and Many Eyes will do the rest (have a look at this cloud for Shakespeare’s complete works for a nice example).
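The data behind a word cloud is nothing more than a frequency list. Here is a small Python sketch of how such counts could be produced; it only illustrates the principle and is not how Many Eyes works internally (the function name and sample sentence are my own).

```python
import re
from collections import Counter

def word_frequencies(text, top=10):
    """Return the `top` most frequent words in the text together with their counts."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens).most_common(top)

frequencies = word_frequencies("To be, or not to be, that is the question", top=2)
print(frequencies)
```

In a cloud, each count would then be mapped to a font size.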
Word trees visualize textual data in another way, allowing the reader in essence to navigate from one word to the next.
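The underlying structure can be approximated with nested dictionaries: pick a root word and record which words follow it, branch by branch. Again, this Python sketch is only my guess at the general idea, with an invented function name, and not a description of the Many Eyes implementation.

```python
import re

def word_tree(tokens, root, depth=3):
    """Build a nested dict of the word sequences (up to `depth` words) that follow `root`."""
    tree = {}
    for position, token in enumerate(tokens):
        if token != root:
            continue
        node = tree
        # Walk down the tree, creating a branch for each following word as needed.
        for following in tokens[position + 1:position + 1 + depth]:
            node = node.setdefault(following, {})
    return tree

tokens = re.findall(r"\w+", "Mary had a little lamb, John had a little pony".lower())
tree = word_tree(tokens, "had")
print(tree)
```

Both occurrences of had share the branch a little and only split at the last word, which is exactly the kind of pattern a word tree makes visible.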
There are of course specialized tools for corpus analysis that do a whole lot more than this in terms of statistics and Many Eyes lacks a whole range of features that a genuine linguistic research tool would need (say, differentiating between different word classes). Yet Many Eyes has several advantages that the more specialized tools lack. It is
web-based
freely accessible
easy to use
and
versatile
In a sense, the points above make all the difference. Desktop-based software is under all sorts of constraints: you have to acquire it, install it and figure out how to get data from and to it, keep it up to date and do all sorts of other “chores” that have little to do with your main objective. And then you can’t even share your data and collaborate as easily as you can on the Web. In other words, you’re using a program, not a service.
Of course Project Bamboo is not just about developing new tools (well, at least not in my mind). The assumption has long been that as soon as someone puts a useful service on the web, a user community will magically appear. This may be true of web video, blogging, wikis and many other services with a broad appeal, all of which can and should be used much more in academia. But with more specialized services, adoption is something that should be actively supported. In other words: we need to do more than just develop tools. We should work to popularize general-purpose services like del.icio.us and document ways in which they can be appropriated for research and teaching - and (most importantly) how they can be connected to one another. At the same time, just putting developers and researchers into a room together can produce impressive results.
A great example of both a mashup of services and a new way of looking at data is the Web version of the World Atlas of Language Structures (WALS). It’s a combination of Google Maps with the print version of the atlas, which shows the distribution of linguistic features across the world’s languages (say, which languages have definite articles). Not only is WALS Online more convenient to use than both the print version and the CD-ROM that comes with it (not to forget it is also free), but it makes entirely new uses possible. Think about collaborative annotation or linking research articles directly to WALS. Imagine a paper that lives on the Web and shows a map section from WALS in a side window, with the text flowing around it.
Developing services like WALS and getting them out there has the potential to completely transform academia in the long run, making it much more collaborative and transparent than it is today. It will be exciting to see what role Project Bamboo plays in that context.
Last week I had the opportunity to have dinner with a group of very interesting and (dare I say it of researchers who rid the world of cancer and explore the origins of life?) fun people. Bjoern Brembs was nice enough to invite me to a get-together at the top of Berlin’s Fernsehturm, some 230 meters above Alexanderplatz. The view was spectacular, though most of the time I was too caught up in the discussion to pay much attention. Catriona MacCallum, Martin Fenner, Randolph Nesse and Bjoern Brembs offered their views on where academic publishing is going and what is wrong with the system we currently live (and suffer) under.
Below are some of Bora’s photos, shamelessly ripped from his blog.
Mark the Chelsea fan and Catriona enjoying a cool Berliner Pilsener
Bjoern Brembs, apparently also a soccer handball enthusiast. Yeah, they don’t *throw* balls in soccer
Bora is up in arms against the less progressive elements of the publishing industry
Martin Fenner’s wife (to whom I apologize - my memory for names is terrible) and humble me
By the way, even if you don’t know a thing about evolutionary medicine or psychology, you should definitely have a look at Randolph Nesse’s blog.
Academic Publishing | Bjoern Brembs | Bora Zivkovic | Catriona MacCallum | Martin Fenner | Open Access Publishing | Randolph Nesse
Oh my, I didn’t manage to write a single post in the whole of April.
While blogging fatigue seems to be a widespread phenomenon, it’s a particularly sore spot when your research is largely about blogging and you are in the process of organizing a panel at an international conference concerned chiefly with personal Web publishing technologies and how they are changing how we talk about science - among other things.
But, as always, time is of the essence. I’m starting to wonder what on Earth I was doing a year ago that allowed me to blog so much (or rather: what I was not doing). Perhaps that’s the wrong way of looking at it though. Yes, blogging takes a lot of time, but it very much depends on how you approach it. Maybe I’ve been a little too concerned with saying it all, i.e. with restricting myself to the planned, substantial and structured writing that we are accustomed to in other contexts. Blogging isn’t always like that and I believe that that’s a good thing. The minimal audience for a blog, as I love to repeat incessantly, is its author. In other words, a blog can be useful as a tool to systematically structure your thoughts - nothing more, nothing less. Forgetting about readership and self-reflexivity (i.e. thinking What is this good for? What goal am I trying to achieve?) can be exactly the right kind of self-motivating strategy. Don’t get me wrong - blogging with a purpose is great. But the luxury of having no specific purpose in mind can be a good thing sometimes, especially when you’re starting to feel that blog writing is actually a burden, a chore that you have to take care of. Obviously, when you’re writing for an institution or in a professional context you are well-advised to think of your readers. But if you’re not enjoying what you do it’s bound to show sooner or later and it seems that with blogging, much of the pleasure that people draw from the activity is a direct result of its unfocusedness - a sort of ‘my blog is my castle’-attitude in communicative terms.
A friend once told me she preferred the original way of blogging: “rambling incoherently to yourself on the street”. Blogging doesn’t have to be quite that bad, but sometimes it helps to ramble just a bit.