Leif Azzopardi | GoogleTechTalks
Our speaker, Leif Azzopardi, is from the University of Glasgow. I was visiting there recently and he told me about this cool work he is doing on applying microeconomic theory to understanding what users do, what interactive users do, with search, and I thought, wow, that sounds pretty cool. He tells me it was inspired in part by some of Hal's earlier work, so we'll be meeting up with him later. But I'm going to let Leif get on with it. So, get on with it, Leif.

Great, thank you, David. I'm really excited to be here and to tell you a little bit about some of the work that I've been doing at the University of Glasgow. As David mentioned, I'm trying to use microeconomic theory to explain the interaction between a user and a computer. We can think about this in more general terms, as any type of person using a system and how they interact with it, but essentially what we want to do is form some relationship between the interactions that people have with the system, what they get out of the system (the benefit), and the cost of those interactions from their point of view. Obviously, how much they get out of the system in terms of benefit depends on the interactions they perform, and the interactions that they perform depend on the costs that they're willing to incur. By using microeconomic theory we get a very nice way to relate these three things together and to work out how users may behave, should behave, or perhaps could behave given the system that we've designed, because invariably the system we've designed will shape how they interact with it.

Today what I'm going to be talking about is essentially interactive search: topic retrieval, where you're looking for multiple documents, not just a one-off shot to find one document but 30 or 40 documents perhaps. You can think about information needs where you're searching for information about medical conditions and you want to find
out what the symptoms are, what treatments and cures are available, and what people think; or you're looking for products, so you want to find out what products are available, what people think about those products, and so forth. Or, what I do: I search for research papers, and I want to find as many of the research papers as possible. Or a patent examiner will search for a number of different documents for a sustained period of time. So essentially the user forms an information need ("I want to find all the patents relating to this new cool feature I'm developing"), the user poses queries to the system, the system returns documents, and so forth. This cycle continues until they've found enough relevant information to stop and that satisfies their information need, or the costs become too high, or they give up for some other reason.

Now, in terms of the research that's been performed in this area, what we find is that most of it is observational and empirical. We know a lot about what users do, and we'll see some examples of that in a minute, but on the theoretical front, the formal models that we have available to us are very limited. Most of the work turns into more conceptual models that describe how users interact. Marcia Bates' berry picking is a very nice example, where she uses the metaphor that people search in a similar way to picking berries: they pose a query, they look at some documents, they choose the best ones, and then they go to the next bush and pick the berries from that one, and go on along this sort of path. We have other frameworks that are more conceptual, like the information seeking and retrieval framework by Ingwersen and Järvelin, which shows you how search is placed in a context, like the work context and so forth. But these are very descriptive; they're not predictive in nature. One of the models you may have heard about that's much more predictive is information foraging theory by
Pirolli, which does try to relate the cost and the benefit given the interactions that a user performs in order to get relevant information, mainly in a browsing situation. Here I'm going to be looking at interactive information retrieval. One of the major challenges in this area, pointed out by Belkin and Järvelin more recently, is that we don't have models that really describe, explain, and crucially predict what a user will do given a system. If we had such a model then we could potentially reason a priori about the interactions and about how a user would behave, which would be really nice, and thus understand these relationships between interaction, performance, and cost. What would we do with that? Well, this would help us guide the design and development of information systems, and potentially we could find some laws of how people interact with systems. So that's one of the challenges.

To try and motivate this, we can think about how users behave. Like I said, we've got a lot of research, and you guys have probably got a lot of data, showing how users behave in certain situations and contexts. In terms of search, we know things like: users like to pose short queries of two to three terms, and they often pose a series of short queries to try and find the document they're looking for. If they are web searchers, they typically look at the first page of results, and even then they may look at only the first couple or handful of results; we know that from the click-through data. Patent searchers, on the other hand, typically express quite a different type of behavior. They'll look at maybe a hundred to two hundred documents per query, they use a Boolean system, so it's quite different from, say, a web search engine or a probabilistic search engine, and they tend to express longer and more complicated queries. Research-wise, Kantor and Smith have shown that users are quite adaptable and
they're flexible: when the system is degraded and doesn't work as well, they showed that users will adapt to the system by posing more queries to compensate for the lack of performance. On the other hand, we know that users will rarely provide us with explicit feedback. These are all interesting observations, but we don't really have any credible reasons for why they do these things. Ultimately, if we had a better understanding, we could explain why users behave like this.

So let's take an example, and this is to motivate why I'm looking at economic theory as a way to describe the interaction between users and systems. We know that users like typing in short queries. Is that because they're lazy? Is it because they don't know any better? What we know from systems-based research, when we use, say, a probabilistic system that applies a sort of fuzzy OR between the query terms, is that longer queries tend to be much more effective. So now you have this problem: users tend to pose short queries, but they are a lot less effective than longer queries. Why don't they pose longer queries? A lot of people have been trying to elicit more terms out of users; for instance, autocomplete does a very good job of making a longer query so they get more performance. But let's have a look at a simulation that I ran to try and explain this from an economic point of view. In 2009 I was doing some work and created a large-scale simulation where I posed thousands and thousands of queries of different lengths to the system: length 1, length 2, up to length 30. We got a certain performance; it doesn't really matter what the measure is, it's just more or less how much relevant information they would have retrieved. The total performance, as you can see, increases quite dramatically and then
starts to tail off, and by about 30 terms they're getting about 0.45 or so.

[Audience question.]

Yes, so the performance measure was mean average precision, and for this particular one it was a probabilistic model, BM25. Again, this is for an ad hoc topic retrieval task, so you're trying to find many documents in response to your query. As you can see, it tails off; that's the total performance.

[Audience:] Is that comparable across the different query lengths, or are they just different information needs?

It's averaged over topics. There are 50 topics, and I average the performance over all those topics, so it's the same set of topics. For each topic I have a set of documents that are relevant in the particular corpus, which is a newspaper corpus of a few million documents, and then I generate queries for that topic of length one, length two, length three, using a query generation process. Good question. This allows us to get an estimate; I haven't got the standard deviation bars of the performance here.

What's interesting about this is: okay, we know that longer queries are more effective, and we know that users pose shorter queries, which are a lot less effective; you're comparing 0.4 to 0.2. From an economic analysis point of view we're very interested in the change: what would happen if I add one extra term? If we do that, we can find the marginal performance. Here we can see that when you go from zero query terms to one query term you get quite an increase in performance, and you get another increase when you add another term and go from one to two, and so on. After you get to two or three terms you start to hit the point of diminishing returns, quite cleanly, and it's exponentially decreasing. Essentially what that's saying is that for every term I pose after two or three terms I'm getting less and less additional performance. So I believe this is like an economic justification for
why users will tend to pose shorter queries: around two to three terms is where they get the most bang for their buck. That's just a starting point for this research, where I go on to describe how we can use microeconomic theory to model the search process.

Now, as David mentioned, I was inspired by Hal Varian's work. He gave a keynote at SIGIR in 1999 and talked about how economics could be useful for search. If we think about what microeconomics is, it consists of two main theories, production theory and consumer theory. The idea is that a consumer wants to maximize utility, or a firm wants to minimize its costs: we want to maximize our utility while minimizing our costs. So it seems like a pretty reasonable and exciting place to consider one of these theories, and the one I thought was more appropriate was production theory.

To give you a lesson in microeconomics 101, production theory, also known as the theory of the firm, is essentially as follows. You can think about a firm that wants to produce some output, like widgets. They have a series of inputs, capital and labor, and they can vary the capital and labor they put into the process. The firm will utilize some type of technology to build these widgets, and the technology that they employ will constrain how much they can get out of the process: with a more efficient technology you can get more out of the process. That just comes from Varian's textbook on intermediate microeconomics. What's nice here is that we can then start to model this. We can put capital on one axis and labor on the other axis, so that gives us two inputs, and we can describe a production function, where each curve represents a particular quantity of output, and that's a function of the inputs of capital and labor. As you go up these production
curves we're producing more and more output: we're putting more input in and we're getting more out. Every point along a curve yields the same quantity, but we're trading off capital for labor. In the top right-hand corner, for the green production function, you can see that there's a production set, and any point within that region will produce at least quantity 3. The frontier, or the production function, is what's critical and what an economist is really interested in: it is the minimum set of inputs needed to produce a certain amount of output. We could still produce the same quantity with a more inefficient solution if we were in the top right-hand corner there, but what we want to know is the best we can get out of the system, and that's these frontiers, these production functions, for each of the different quantities. You might notice that we want to move toward the bottom left corner, which means we're using fewer inputs to produce the same amount of output. The technology governs or constrains this production set, so better technology means we can use fewer inputs to create the same output.

That was quite a brief introduction to production theory. Now what I'd like to do is show how we can apply production theory to model the search interaction process. Let's have a look at the interactive scenario again. In this viewgraph we have the user, and we can think of the queries that they're submitting to the system as an input, mental capital if you will, that they're utilizing to try and extract relevant information. Another input would be that they look at a number of the documents that come back and make assessments, which is kind of like the labor of
physically going through and looking at each document, going, yes, this is relevant to my information need. So here we can consider these things as the two inputs to the firm, and the output essentially is the amount of relevant information that they're producing. If we rebadge this and view search as production, then here's our firm: the user and the system together. The technology that they employ is of course a search engine, and depending on what type of search engine they're using, this will constrain how much input they have to put into the system. The number of queries and the number of assessments they're willing to provide per query are the inputs, and the output here is relevance, or the gain, or the utility that they obtain from the search process. Now, this is not exactly like a widget, because in production theory we're actually producing a good or a service, so here the metaphor breaks down a little bit. It's more like the user is finding this relevant information, they consume this information, and that's how they gain this utility, so they're both the producer of the information and the consumer of it, if you will.

Similarly, what we can do now is say that these are our two inputs, the number of queries and the number of assessments per query that they're willing to make, and describe some production functions. The gain, or let's say the number of relevant documents they find, can then perhaps be described as a function of the number of queries and the number of assessments per query, and we can then get various levels of gain: they find 10 relevant documents along the blue line, 20 relevant documents along the red line, and 30 relevant documents along the green line. It's interesting here because the user has different options. They can either pose a few queries and look at a lot of documents, or go to the other extreme and pose lots
of queries and look at a few documents per query, or anywhere along that line; any point along that line will produce the same amount of relevant information for them. So of course now they've got a decision about what they'd like to do and what they'd like to achieve, and there's a cost associated with their actions. We could map a cost function onto this to try and work out the optimal point, but I'm getting ahead of myself. Essentially these curves represent how well a system could be used: it's like saying this is the minimum amount of input required to achieve that level of gain. In practice, of course, a user might not be able to achieve this, because they might not be able to use the technology efficiently, and they'll be further back from that frontier. But, like I said, for an economist it's very interesting to work out the best that you could get out of the system.

As I mentioned, there is a range of strategies that a user could employ to get these 10 relevant documents or 20 relevant documents. I'm not saying a user comes to the computer thinking, "this is what I'm going to do"; it's probably more subconscious, in that through their interaction with the system they've evolved some strategy, potentially a cost-minimizing strategy, to get the relevant information. They might think to themselves: should I pose a few queries and assess lots of documents per query, similar to what a patent searcher does? Or should I pose lots of queries and assess a few documents each, like a web searcher? Or should I use some other combination along the curve? Like I said, if we map a cost function onto their interactions, we can then consider what would be the most cost-efficient way for a user to interact with the system.

So I've presented a model, a very simplistic model admittedly, but I feel that it's representative of the search process, and it's trying to say that the gain
that a person gets from it is simply a function of the queries and the assessments per query that they make. While it's abstracted, it is representative of how people search. Of course, you could say that people don't fix the number of documents they look at per query, they vary it: sometimes they look at more documents, sometimes a few fewer. But if you think about it, on average a user will look at five documents or ten documents per query, so it's pretty reasonable in that sense. There are certainly ways that we can make this more complicated, but let's see how far we can get with this simple model. What does this tell us about people's search behavior and the way they interact with the system? In order to find out, we need to be able to estimate these production functions so that we can see if there is a trade-off between querying and assessing.

So here's a set of experiments that I developed, a simulation essentially. The task I was looking at was again interactive IR, so there were at least 30 or 40 relevant documents per topic, up to 200 or so possible documents per topic, and we had about 50 or so topics per collection. I used a number of newspaper collections of a few million documents. The goal was to find news articles about, for instance, Airbus subsidies by European governments, tropical storms where people were killed, or cases of insider trading. The goal in the simulation was to find enough relevant documents to reach a certain level of gain, and then we would record that. So the output is gain, or the number of relevant documents, and the two inputs are again the queries and the number of assessments per query. I developed a simulation, built in C++ using the Lemur toolkit, to carry out experiments to estimate these production curves. So, the
simulation: obviously I've replaced a real user with a nice automated user, and we've developed this agent to interact with the system. We have a set of topics and we know the relevant documents, and what we assume is that the simulated user has perfect knowledge of the situation, and from that perfect knowledge they can generate really, really good queries. So we seed the simulated user with queries generated from the relevant set, and because they have perfect knowledge of which query is the best one to submit first, they enter that into the system. In this simulation they generated queries of length three, which is about the average length of queries (I've done it with other lengths too). They issue the query to the system, and then we fix the number of assessments: we say this user only looks at five documents per query. Then we repeat the cycle again and again until they reach the desired gain level, like 10, 20, or 30 percent, or whatever percentage of documents you're looking for. We do this for 5, 10, 15, 20, and so forth, up to about 500 documents in the simulation, and we record the number of documents and the number of queries for each level of gain so that we can plot the curve out. Essentially this simulation assumes that the user has perfect information, and this is kind of nice, because what it's telling us is how well the system could be used, or how well a perfect user could use the system. It's not necessarily what they do in practice, but it gives us the upper bound of how good the system is.

So let's have a look at some of the production curves that were produced by the simulation. What's first of interest is the shape of the curves. They look very much like the ones we had before: it isn't a straight line, there's not a direct trade-off between querying and assessing, it's curved. Here this is for one retrieval
model called BM25, which is just a probabilistic retrieval algorithm, and we have two different levels of gain. The purple one is when they get 0.2 gain out of 1, so they find about twenty percent of the relevant documents, and then they move up to 0.4 gain, which is finding about forty percent of the relevant documents. I've used normalized gain here because topics have different numbers of relevant documents. What's interesting is that at a gain of 0.2 we have a number of different options. For instance, a user could submit about eight queries and look at five assessments per query, and get the same gain as if they issued about four queries and looked at fifteen assessments per query. On the other hand, if they wanted to get more gain, to find about forty percent of the documents, then they would have to pose eight queries at fifteen documents per query, or four queries at forty documents per query. So you can see there's an increase there; obviously they've got to put more in to get more out of the system. What's kind of interesting is that in order to double their gain, they would have to essentially double the number of assessments or double the number of queries, or more than double, actually, to go to a higher level. Another
set of production curves I've now produced is for three different models. One is the probabilistic model; one is a Boolean-based model, where we're using an implicit AND between the query terms; and one is TF-IDF, which is a vector space model, so it looks at the angle between the documents and the query to generate a ranking of the documents. In terms of mean average precision, or the overall performance of these models in a lab-based study, what we generally find is that BM25 comes out on top, TF-IDF is next, and Boolean doesn't perform very well at all. But that just gives us a number. What these graphs show is the different types of interaction that you'd have to perform in order to find a certain number of documents, here about forty percent of the documents. I think this was a real wow moment for me, because it shows, first, that quite a different level of interaction is required to get the same number of documents, and also that some of these curves don't go all the way around. First of all, a lot more interaction is required on both the Boolean model and TF-IDF, so they're pretty inefficient in that sense. And what we see here is that there are no combinations with a depth of less than 120, and here about 40 for the Boolean model, that would allow you to get to that level of gain. They just don't exist; you can't do it. Even though we're using the best queries possible to try and find these documents, what you need to do is go to a greater depth in order to get that level of gain. So what it shows you is that BM25 provides users with a range of strategies for accomplishing their task of finding forty percent of the documents, but if you were to give them, say, the TF-IDF model, the green line, then they would have to assess at least 125 documents in order to get that level of gain. They don't have a choice in the matter.

[Audience question, partly unclear in the recording:] ... do you see
the same sort of behavior?

Yeah, so that's a good question. If we went to a lower level of gain, what we'd see is that these production curves, like in the previous slide, would get closer to the origin. Obviously, if it's a one-shot query and you're only looking for one document, then the best you can do is one query and one assessment, which would be the first result, so you'd just have a little dot in the corner there. But we're looking at interactive retrieval, so you want to find more documents, and as you get to higher and higher levels of gain, or higher levels of recall, you have to do more interaction with the system.

Another interesting point: like I said, the work by Kantor and Smith showed that users can adapt to systems. Here we can imagine that BM25 is our good system, whereas Boolean is a system that's a bit degraded; it's not as good as BM25. For the user adaptation, you can see that on BM25 you'd pose five queries if the user looks at 25 documents per query, but if they move to Boolean then the way they'd have to compensate is to issue 10 queries at the same number of assessments. So in order to compensate they would have to issue more queries to adapt to the degraded system. I think it's nice that we can see from the curves themselves how a user would have to adapt to changes between systems.

So that's all well and good: we've got these empirically estimated curves. But what would be really nice is if we could find a mathematical formulation for them. In economics they often use the Cobb-Douglas production function, generally because it has nice mathematical properties: it's easy to differentiate and so forth. Essentially it's a function of the two inputs, the number of queries and the number of assessments per query. We have K, which is essentially the efficiency of the technology, so the higher the level of K, the more efficient the technology and the more relevance
you can find for less input, and we have a mixing parameter, alpha, which is between one and zero and is determined by the technology that we're using. What I did then was estimate these from the curves. This is an example where the gain is 0.6, for each of the different models. Here you can see that BM25 is a much more efficient technology: its K is very high, and the TF-IDF one is lower than that. We see that the alpha parameter is slightly higher for BM25, which just means that we tend to favor querying over assessing for that particular model, and similarly with Boolean. You can see that the goodness of fit is really quite high, and significantly so. So that was really great: we can now mathematically model the interaction for a given system. This allows us to differentiate and find out things like the marginal product of querying, or the change in gain: if I added one extra query to my process, how much more gain would I get? On the other hand, if I looked at one more document per query, the marginal product of assessing would tell me how much more gain I'd get from that. And if we look at those together, we can consider the technical rate of substitution, which is essentially how much I have to give up of one for another: how many more assessments per query I'd need if I were to pose one less query. At the top point there, I would have to assess about 0.4 more documents per query, on average, if I gave up one query. If I move down to the green box, where I was at six queries and move down to five queries, then I'd have to assess another 4.2 documents per query, so there is a trade-off there. Here's the example I just mentioned: at six queries and twenty documents per query I have to look at 120 documents, and if I went to five
queries at 24.2 documents per query, then I'd have to look at about 121 documents to get the same level of gain. So this is a really interesting point; it's really a balancing point there. Now, which one is better? Should I pose one less query and look at one more document, or should I pose one more query and look at one less document? Of course this all depends on the cost of assessing and the cost of querying. If the query cost is very high, because I can't think of a query for instance, then posing five queries is probably the better solution. That brings us to the cost of interaction: which is the better point along this curve to interact at, which is the optimum?

The way I tried to resolve this was to provide a user cost function. The cost function that I use is a simple linear model as a starting point. Again we have our two inputs, Q and A. Q times A is essentially the total number of documents I look at, and I'm assuming a unit cost here: each document costs one, so it's an abstract cost. Beta then represents the relative cost of a query to an assessment. If beta is greater than one, that's saying a query is more expensive than an assessment, and if it's less than one, it's saying that a query is cheaper to pose than a document is to look at.

So far I've talked about costs in a very abstract way. We can think about lots of different costs: the mental cost of thinking about the query and the mental cost of reading a document; the physical cost of actually typing; and temporal costs, that is, how much time we spend going through and doing these things. But for the purposes of this analysis, what I resorted to was the work by Gwizdka, who did some dual-task measurements while people were searching the web to try and find out or ascertain
the cost of querying and the cost of assessing. The dual-task setup gave participants a secondary task while they were searching, and these are the reaction times while they were doing those particular tasks. When they were querying it took them longer, about two thousand six hundred milliseconds, whereas when they were assessing it took [figure unclear in the recording]. To find a relative cost I just divided them, which gives a relative cost of about 1.15, and that's what I use in this model. Of course this would depend on the users and lots of other factors, and we would probably want to measure it a lot more precisely, but it's a good starting point.

Now that we have a cost function and a way to estimate the relative cost, let's have a look at what might emerge here: what would be a cost-efficient strategy? On the top there I have the same production curves that I showed you before: BM25 at 0.6 gain is the red curve and BM25 at 0.4 gain is the blue line, with the number of assessments per query on the x-axis and the number of queries on the y-axis. Below, I've plotted the cost. What you can see is that the minimum cost point is at around 15 assessments per query, and if I wanted to go from 0.4 gain to 0.6, the minimum would still be at 15 documents. So it's better for me to just pose more queries if I need to increase my gain: to increase my gain, pose more queries, as opposed to looking at more documents, using this particular system.

Use a different system, however. Go to the Boolean model, where again I've got the same gain levels, 0.4 and 0.6. There the minimum cost points just happen to be the end points, so if I wanted to increase my gain, rather than posing more queries I'd need to look deeper in these ranked lists and move from about 25 documents to about a hundred documents per query in order to increase my gain from forty percent to
sixty percent. So it's quite a different way that a user has to interact with the system, depending on the retrieval model that's used. I think this was really interesting for me, because patent searchers will generally look at about 100 to 200 documents per query, right? This is also Boolean, and they want high recall, so when we look at the higher level of 0.6 you can see that it's between 100 and 200 documents there. So it fits kind of nicely with our intuitions, or at least with the observations that we're seeing. Similarly, the BM25 model shows that for a much more performant model it's better to look at fewer documents per query.

Now if we contrast these two systems, what we can see is that on BM25 you'd issue a lot more queries, about 40 queries compared to 10, but you'd examine a lot fewer documents per query, and conversely for Boolean. You can also see that BM25 is a much more efficient technology: the abstract cost that we have here is about 330 compared to 1,100, so there's actually quite a lot of difference between these two systems.

Now let's have a hypothetical experiment. Let's imagine that we're varying this beta parameter, and let's say that the cost of queries goes down. Say we move from just a standard query box to one that corrects your spelling, so I can type faster and make mistakes and it will still correct them, or I even move to autocomplete. That makes queries cheaper: I can type them in faster, or I don't have to think as much, so the cost of querying comes down. What we would expect from our model is that users would issue more queries, because we've reduced the cost, and as a result they would decrease the number of assessments per query. Conversely, if the cost of querying went up, then there would be a decrease in queries issued, because now it's really hard to formulate a query
that would get something out of it. So either we make it very hard or very slow for them to type in a query, or maybe the users themselves are not very familiar with the topic, and so they'd be thinking, "Oh, this is really hard, I can't work out any query that would be really good here." A good example of this is image retrieval, where you type in a word, you're looking for a picture, and you can't really think of a very good query to pose. In that situation the documents themselves are also very cheap to assess, because you just visually analyze them, so the relative cost of querying is very, very high, and thus it's probably better to issue fewer queries and look at more documents per query.

Just to back up this claim from the hypothetical experiment, here's a small analysis. As we go from the blue curve at the bottom to the curve at the top, I'm increasing the beta parameter through 1, 10, 20 and 30. (I know the graph shows an "a"; that's meant to be a beta.) The red line indicates the minimum cost point on each curve. So as I move from a one-to-one mapping to queries being 30 times more expensive than assessing a document, you can see that the minimum cost is at a point with a higher level of assessments. So as querying costs go up, we want to assess more and more documents per query, and consequently issue fewer queries.

So, to sort of sum up, what are the implications for design of these types of models? I've presented only one model for a particular scenario, but we might be able to generalize this to other types of human-computer interaction where there is a performance or benefit and a cost in the interaction. These models provide us with a very compact and neat way to describe user behavior, and they give us a nice way to reason about how users may change their behavior. So we can theorize about how the system will affect their interactions if I add a new feature, or if I
change the performance level, or what have you. And then we can work out whether this is desirable: do we want the user's interaction to change in this way? Do we want them to type in more queries? Think about the Google Instant feature, where you type in a letter and it's querying for you automatically, and as a result they're obviously looking at fewer documents per query. Or do we want them to assess more? What's better for us?

We can potentially categorize the type of user: are they acting rationally, are they minimizing their costs when they're using the system, or are they behaving in a more irrational manner, at some other point along the curve, and how can we help them get to a more cost-efficient solution? And we can also scrutinize the introduction of new features. Often we're trying to add new features to our search engines or social systems or interfaces, but are they going to be of any use? So here we can potentially perform a simulation to examine, say, the use of relevance feedback, and see why users aren't marking documents: is it because the performance gains they're getting are not worth the effort they're putting in? And so we can evaluate these a priori and consider how they're going to affect the user's interaction.

So in terms of future directions: I've been looking at trying to validate this theory, because obviously it's just a theory, so I've been going out and conducting empirical experiments with users to try and validate these assumptions and see whether they hold true. I've been looking at different ways that we can incorporate the different types of interactions that we have with the system, so find-similar, relevance feedback, browsing versus searching, within the mix, and different types of queries that a user could pose. And of course we have only developed very simple cost models, so if we develop more sophisticated cost models and we're able to accurately measure the
cognitive costs or the other physical costs that are involved, would we get different solutions here? And of course we could even look at, given the analogy, how we can map this to other types of search tasks or even other types of human-computer interaction. So that basically concludes my talk. Any questions?

[Audience question, largely inaudible, about document collections and search engine optimization.]

So that's an interesting question. Essentially you're saying that, say, in the context of patent search, patent writers are trying to obscure their documents. [Audience: and they may be hard to find.] Yes, that's right, whereas search engine optimizers are making them more relevant. I haven't really thought about that specific question. I mean, what that's going to affect is whether you can get to those higher levels of gain, so if that document's hidden somewhere in a corpus, it's going to mean you have to query really hard to try and find that document. One of the examples of that was the Viagra patent, where the way they spelt the chemical name was actually incorrect, so chemical searchers who were trying to search for the chemical name were not coming up with a hit until they put an F in there instead of the PH, right? So that means essentially it's increasing the query cost for that user, and that's not necessarily desirable.

But yeah, that kind of alludes to the fact that what these curves are estimating is when the user has perfect information. Actual users don't have perfect information in practice, so what they're able to get out of the system is not necessarily optimal. So how far away are they from that? It would be very interesting to start modeling how users are behaving and to see what levels of performance they can get, or what their production function is. I know it didn't
really answer your question specifically, but...

[Audience question.]

So essentially the question was how did I create a user for the simulations, and how do I generate the queries for them. What we did was, for each of the topics I have a reference set of relevant material, and from each of those documents I created a language model. So I looked at the probability of a term appearing in that document and then essentially drew out the terms, the most probable one first, then the next most probable, and the next most probable after that, given the language models that we used. I used a number of different types of language models that weight the terms either by term frequency, or by how distinctive that term was in the document, or a combination of both, like term frequency-inverse document frequency. It allows you to draw out the most probable terms from the document, so it generates a really, really nice query, and it will often retrieve that document quite highly, along with a number of other associated documents.

And so I assume that the perfect user would then be able, out of all these queries that are generated, to pick the one that retrieves the most relevant information first. It's like a greedy approach: they take the first query, which might retrieve 10 relevant documents out of 10, then they get to choose the next one, the next query, which gets 10 other documents, so it increases their gain, and so on and so forth through the process. I did vary the different types of query generation methods that are used, and this is just one example here; they all get very, very similar results. But what it does mean is that we're making an assumption that the user is rational and that they do have perfect information, so it means that they experience diminishing returns for each query: by the nth query they pose, they're getting maybe one relevant document back, and then less than one relevant, until they get no more
relevant documents back. Users don't behave like that in practice. The empirical experiments that I've been conducting show quite a different situation, where when we looked over the queries that they were posing, it was almost like they were getting increasing gains. The first couple of queries were kind of poor, they were still learning about the topic, and then they sort of improved their gain: "this is a better query, ah, now I'm learning a bit more about it", and then they posed another, better query. So instead of decreasing gains like that, they were getting increasing gains, or kind of keeping a linear approach. So that was quite an interesting observation. That's another talk; you can invite me back next year for that.

[Audience: So you're saying they're working off the curve to begin with, until they become this sort of quasi-knowledgeable user, and then they kind of move onto the curve?]

That's right, they are learning in the process of querying to become a sort of expert about it. You can imagine, if you've done a lot of work on a particular topic then you can probably express really, really good queries first and get the bulk of the relevant information back, and so on, but if you don't know much about the topic then you're having a shot in the dark and trying to learn about it. Which I think is really, really interesting: to see how users learn in the process. Are they experiencing diminishing returns, in which case we could probably infer that they are an expert user? And if they're getting increasing returns, then we could infer they're actually just learning about the topic. The subsequent time they come back and try, maybe they're re-finding, then you'd expect that they'll get diminishing returns up to that, you...
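
[Editor's sketch, not part of the talk.] The hypothetical beta experiment described in the talk can be illustrated with a toy model. The sketch below assumes a cost of the form Q * (beta + A), where Q is the number of queries, A is the number of assessments per query and beta is the relative cost of a query, and it assumes a Cobb-Douglas style production function, gain = k * Q**a * A**b. Both functional forms, the constants k, a and b, the target gain and all function names are illustrative assumptions; none of them are taken from the talk or its plots.

```python
"""Toy cost-minimization sketch; functional forms and constants are assumptions."""


def queries_needed(gain, assessments, k=1.0, a=0.6, b=0.4):
    """Queries Q needed to reach `gain` at `assessments` documents per query.

    Assumes a hypothetical production function gain = k * Q**a * A**b.
    """
    return (gain / (k * assessments ** b)) ** (1.0 / a)


def total_cost(gain, assessments, beta, **kw):
    """Assumed cost model: each query costs beta, plus one unit per assessed document."""
    q = queries_needed(gain, assessments, **kw)
    return q * (beta + assessments)


def cheapest_depth(gain, beta, max_depth=200, **kw):
    """Grid-search the assessments-per-query depth that minimizes total cost."""
    return min(
        range(1, max_depth + 1),
        key=lambda depth: total_cost(gain, depth, beta, **kw),
    )


if __name__ == "__main__":
    # Same beta values as the hypothetical experiment in the talk.
    for beta in (1, 10, 20, 30):
        depth = cheapest_depth(gain=50.0, beta=beta)
        print(f"beta={beta:>2}: cheapest depth = {depth} assessments per query")
```

Under these assumed forms the cheapest depth works out analytically to A = beta * b / (a - b) whenever a > b, so it grows with the relative query cost, which matches the direction of the effect described in the talk. In this toy model the cheapest depth also does not depend on the target gain, so extra gain is bought by posing more queries; a different production function would behave differently.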
--- END OF TRANSCRIPT ---