Archive Page 3

Some techno-based mythical mumbo jumbo in a New York Times article about computerized, high-frequency stock trading:

High-frequency traders often confound other investors by issuing and then canceling orders almost simultaneously. Loopholes in market rules give high-speed investors an early glance at how others are trading. And their computers can essentially bully slower investors into giving up profits — and then disappear before anyone even knows they were there.

That is a little too dramatic, maybe. One interesting question: is there a mathematical proof that high-frequency trading can yield high revenue? Is it based just on exploiting some technical loopholes, or is there something more fundamental here?

Red for danger, green for?

Sometimes it’s funny to see how red and green color codes are used in software. As red/green schemes are basically taken from traffic lights, we would expect red to be used for dangerous situations (stop!) and green for safe situations (move on), as in Microsoft’s personal firewall:

microsoft firewall

But sometimes color codes are simply used for whatever is good for the software service. For example, in Brightkite, a location-sharing application, this is what full location disclosure (posted to a public web page) looks like:

screen-capture-1

and this is what private mode, which is “safe”, looks like:

screen-capture-2

So, can people trust these color codes? Probably not.

Locaccino trail

Osnat’s trail in the park, as seen on Locaccino

Categorizing Web Services

Recently, a paper by Aviv Segev and me was accepted to IEEE Transactions on Services Computing. The article, with the rather long name of “Context-Based Matching and Ranking of Web Services for Composition”, tackles a seemingly simple problem: given a Web service described in WSDL (short documents that describe Web services) and a set of ontology domains, to which ontology domains does the service belong, and how strong is this “belongness”?

The simple solution is to take the words in the WSDL, take the words in the ontology, and compare them. This is how documents are classified into topics in a text corpus. The problem is that WSDL documents contain very little text. In the worst case, they contain just the names of the input and output parameters. In this paper, we looked for a solution on the Web. We took the words from the WSDL descriptions and threw them at search engines. We collected words from the results and ranked them according to the number of times they appeared for different elements of the service. For example, if a service’s input parameter was “zip” and its output was “address”, and running these words through a search engine returned words such as “location” (due to both zip and address) and “compression” (due to zip, as in WinZip), we ranked “location” higher than “compression”. We then took those words, added them to the WSDL words, and used all of them in the comparison process.
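To make the zip/address example concrete, here is a rough Python sketch of the idea. It is not the code from the paper: the search engine is replaced by a hard-coded dictionary (FAKE_SEARCH_RESULTS), the names rank_context_words and match_score are made up for this post, and both the min_support threshold and the overlap-based score are simplifying assumptions.

```python
from collections import Counter

# Hypothetical stand-in for a search engine: maps a query term to words
# found in its results. A real system would call a search API instead.
FAKE_SEARCH_RESULTS = {
    "zip": ["location", "compression", "postal", "archive"],
    "address": ["location", "street", "postal"],
}


def rank_context_words(service_terms, search=FAKE_SEARCH_RESULTS.get):
    """Count, for each result word, how many service elements produced it."""
    counts = Counter()
    for term in service_terms:
        # set() so that a word is counted at most once per service element
        for word in set(search(term) or []):
            counts[word] += 1
    return counts


def match_score(service_terms, ontology_words, min_support=2):
    """Share of ontology words covered by the expanded service vocabulary.

    min_support is an assumed cutoff: a context word is kept only if at
    least that many service elements led to it.
    """
    ranked = rank_context_words(service_terms)
    expanded = set(service_terms) | {
        word for word, count in ranked.items() if count >= min_support
    }
    ontology = set(ontology_words)
    if not ontology:
        return 0.0
    return len(expanded & ontology) / len(ontology)


ranked = rank_context_words(["zip", "address"])
print(ranked["location"], ranked["compression"])
print(match_score(["zip", "address"], ["location", "postal", "city"]))
print(match_score(["zip", "address"], ["compression", "archive"]))
```

With this toy data, “location” is supported by both service elements while “compression” is supported only by “zip”, so the service ends up closer to a geography-like ontology than to a file-archiving one.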

We compared our method to simple string matching, to tf/idf, and to a number of other methods. We found that our method and tf/idf combined outperformed every other method taken on its own.



About this Blog

This blog is a place for half-baked ideas about research, computers, robots, AI, and whatever. My name is Eran Toch, and I am a postdoctoral fellow at Carnegie Mellon University. For more info, see my homepage.
