Selection of a sparse parameter using a genetic algorithm
I am facing a parameter selection problem that I would like to solve with the Genetic Algorithm (GA). I have to choose no more than 4 parameters out of 3000 possible. Using a binary chromosome representation seems like a natural choice. The scoring function penalizes too many "selected" attributes, and if the number of attributes is acceptable, it then evaluates the selection.
The problem is that in these rare conditions the GA is unlikely to improve the population. Neither the average cost of fitness nor the fitness of the "worst" person improves over generations. All I see is a small (even tiny) improvement in the top person's score, which I believe is the result of a random sample.
Coding the problem using parameter indices also doesn't work. This is most likely due to the fact that chromosomes are directional, while the problem of choice is not (for example, chromosomes [1, 2, 3, 4], [4, 3, 2, 1], [3, 2, 4 , 1], etc. Identical)
What representation of the problem would you suggest?
PS If it matters, I am using PyEvolve .
a source to share
I am not familiar with PyEvolve, but from what I can think of genetic algorithms you are taking 4 steps,
- Chormosome Evalutation (you probably figured this out by now)
- Chromosome initialization
- Crossover crossover
- Chromosome mutation
I think you can do this with lists just fine. You will need to overload some operators, but it looks like PyEvolve allows you to do this . Simple would be to keep the list view, just sort them numerically before returning the chromosome.
I need to know more about your problem, but here are my suggestions. Since you have a variable number of parameters, you need to come up with some kind of probability distribution for the number of parameters in the chromosome. I'll take a single random case at 1,2,3,4 here, but you can try something else if you like it. I'm going to call this distribution P_n.
- Initialization. The seed of your population is (at least) 3000 chromosomes. Name these c_1, ..., c_3000. Draw n_j from P_n. put j in c_j. Select the remaining parameters n_j - 1 with a uniform random distribution from the rest of the parameters.
- Crossover. Let's say we have two chromosomes. C_1 and C_2. We're going to create (and return) chromosome C_3. Choose n_3 out of {n_1, n_2} with probability 1/2. Now put the parameters C_1 and C_2 in one list (and they are unique, so if both C_1 and C_2 contain parameter 1, it is only in the list once). Draw n_3 parameters from the list of joints and place them on chromosome C_3.
- Mutation. Given Chromosome C_1, draw n_1 * from P_n. if n_1 * n_1, randomly remove elements from C_1 until there are n_1 * elements. If n_1 * = n_1 randomly selects 1 element from C_1 and replaces it with a parameter chosen at random from those that are not in C_1. If n_1 *> n_1 randomly add items to C_1 until there is size n_1 *.
There are now many ways to work around this, so do what matters most to your problem.
a source to share
I think it might be more appropriate here to decompose the singular value ( http://en.wikipedia.org/wiki/Singular_value_decomposition ) given the limitation on the number of parameters.
a source to share