Samples / Help for Location Process

AlexO
New Altair Community Member
Hello,
I have the following task to explore:
We want to predict the position of a WiFi Client in a certain room. We have the positions of the Access-Points and the RSSI-Values (WLAN field strength).
I watched tutorials 1-8 on YouTube and tested the decision tree in Studio 7.1. I am a beginner in data mining, and it is hard for me to judge whether this is the right approach.
Does anybody have samples for this task, or for similar tasks?
Is the "decision tree" the right process in the Studio to get a good result?
Thank you!
Regards
AlexO
Answers
-
Alex,
thanks for trying out RapidMiner! I think there are some ways you can get better results.
First of all, most problems in data science are about the representation of the data. What does your table look like? I assume you have something like:
Truth, WIFI-Strength1, WIFI-Strength2, WIFI-Strength3
etc? Thinking about a useful representation is key.
My personal feeling (if you have a similar representation) is that a different model might be better. A Logistic Regression or an SVM inside a Polynominal by Binominal Classification operator might make sense.
Can you please tell us a bit more about the structure of your data?
Best,
Martin
-
There is an academic paper somewhere that used RapidMiner to calculate the location of people using WiFi signal strength. I'm not sure where it is, though.
Try searching Google Scholar for RapidMiner + wifi or signal strength; that should give you some pointers.
-
Hello JEdward,
unfortunately I could not find this paper. Thank you anyway.
Alex
-
Hi Martin,
thanks for the encouragement. The question about the data is quickly answered: I am free to define whatever data I need.
What I will/should have is:
- The count of Access-Points (e.g. 5). Data 1 .. n
- The borders of the room I have to predict (e.g. a quad of 50x50 meters).
- "Learning data" (I am not sure how the position should be represented...)
--> I want to teach the System before any prediction
- RSSI (field strength) + Position for the Learning data
- RSSI (field strength) without Position for the prediction
That's it.
I would be glad to get feedback.
Regards
Alex
-
What will be your target variable? The room the device is in, or actual X/Y coordinates? In the first case it's a classification problem, in the second it's regression. (Which could be used for classification if you have a "map" of the building - then you can calculate the room from the predicted coordinates).
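The remark about deriving the room from predicted coordinates can be sketched as a simple lookup. This is only an illustration: the `ROOMS` floor plan, the room names, and the rectangular room shapes are assumptions, not something given in this thread.

```python
from typing import Optional

# Hypothetical floor plan: room name -> (x_min, y_min, x_max, y_max) in metres.
ROOMS = {
    "Room A": (0.0, 0.0, 25.0, 50.0),
    "Room B": (25.0, 0.0, 50.0, 50.0),
}

def room_for(x: float, y: float, rooms=ROOMS) -> Optional[str]:
    """Return the first room whose rectangle contains (x, y), else None."""
    for name, (x_min, y_min, x_max, y_max) in rooms.items():
        # Half-open intervals, so a point on a shared wall lands in one room only.
        if x_min <= x < x_max and y_min <= y < y_max:
            return name
    return None
```

A regression model would supply the `(x, y)` pair, and `room_for(x, y)` would turn it into a class label; coordinates outside the map return `None`.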
You'll probably make measurements on defined points of the building and record the coordinates or the room identifier as the target variable (label). Then you can build models from this data and apply them to new data.
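As a rough sketch of "build a model from labelled measurements, then apply it to new data", the following uses a plain nearest-neighbour lookup in Python. Nearest neighbour is not proposed anywhere in this thread; it only stands in for whichever learner you pick in RapidMiner, and the training values are invented.

```python
import math

# Hypothetical training data: RSSI vector (fixed station order) -> surveyed (x, y).
training = [
    ((-40, -70, -80), (1.0, 1.0)),
    ((-70, -45, -75), (25.0, 5.0)),
    ((-80, -72, -42), (40.0, 40.0)),
]

def predict_position(rssi, examples):
    """Return the label of the training example closest in signal space."""
    _, position = min(examples, key=lambda ex: math.dist(ex[0], rssi))
    return position
```

Calling `predict_position((-42, -68, -79), training)` returns the coordinates recorded for the most similar training reading. Note that every RSSI vector must list the stations in the same order.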
You'll have a variable number of RSSIs. This is usually not easy to express in RapidMiner. So you'll probably filter for the top 3 or 5 signals and use the Pivot operator to transform the dataset so it only has one record per reading.
-
The target will be coordinates. Coordinates could be X/Y or geodata. With both you can get a resolution of 1 m.
The variable number of RSSIs is by design. There are many effects which can change the RSSI...
So is RapidMiner the wrong tool for this?
-
No, that's not what I meant. RapidMiner is of course a good solution for this problem. You just have to be smart when preparing the data.
Models just need to have a fixed attribute schema (in each product). They can't work with non-tabular data. Many algorithms also can't work with missing data (this is again conceptual, not a RapidMiner limitation).
Some possible solutions:
- If you have a fixed number of stations installed, your table could be like this:
Measurement ID; Position; Station1; Station2; ... StationN
If no signal strength of Station5 is available, you just put 0 into it.
RapidMiner can work well with a huge number of attributes, and the structure can be automatically created, e.g. with the Pivot operator.
- If the number of stations is not fixed and higher than you'd like to express in the previous data structure, you could go with this:
Measurement ID; Position; Top1StationID; Top1StationStrength; Top2StationID; Top2StationStrength; ... as long as it makes sense.
Your ultimate requirement is to express each "example" (measurement, position) in one row in a tabular data structure. That's it.
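Outside RapidMiner, the two layouts above could be produced roughly like this. It is a sketch under assumptions: the raw record format, the sample values, and the function names are made up for illustration. `missing` is a parameter because 0 only works as a placeholder when a larger number means a stronger signal on a non-negative scale; for negative dBm readings a low floor (here -100, an assumed value) is the safer choice.

```python
# Hypothetical raw scan data, one record per (measurement, station) pair:
# (measurement_id, position, station_id, rssi_dbm)
raw = [
    (1, (2.0, 3.5), "Station1", -52),
    (1, (2.0, 3.5), "Station3", -70),
    (1, (2.0, 3.5), "Station4", -61),
    (2, (10.0, 8.0), "Station2", -48),
]
STATIONS = ["Station1", "Station2", "Station3", "Station4", "Station5"]

def group_by_measurement(records):
    """Collect the signals of each measurement: {id: (position, {station: rssi})}."""
    grouped = {}
    for measurement_id, position, station, rssi in records:
        grouped.setdefault(measurement_id, (position, {}))[1][station] = rssi
    return grouped

def fixed_station_rows(records, stations, missing=0):
    """Layout 1: one column per known station; absent stations get `missing`."""
    rows = []
    for mid, (position, signals) in sorted(group_by_measurement(records).items()):
        row = {"MeasurementID": mid, "Position": position}
        for station in stations:
            row[station] = signals.get(station, missing)
        rows.append(row)
    return rows

def top_n_rows(records, n=2, missing=0):
    """Layout 2: TopKStationID / TopKStationStrength pairs, strongest first."""
    rows = []
    for mid, (position, signals) in sorted(group_by_measurement(records).items()):
        ranked = sorted(signals.items(), key=lambda item: item[1], reverse=True)[:n]
        row = {"MeasurementID": mid, "Position": position}
        for rank in range(1, n + 1):
            # Pad with (None, missing) when fewer than n stations were heard.
            station, strength = ranked[rank - 1] if rank <= len(ranked) else (None, missing)
            row[f"Top{rank}StationID"] = station
            row[f"Top{rank}StationStrength"] = strength
        rows.append(row)
    return rows
```

Both `fixed_station_rows(raw, STATIONS, missing=-100)` and `top_n_rows(raw, n=2, missing=-100)` return exactly one dictionary per measurement, with the same keys in every row, which is the fixed schema a model needs.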
I would guess that the first representation is easier to work with and it's also better suited for most modeling algorithms.
-
Thank you, Balázs